ABSTRACT

Lagrangian reconstruction of large-scale peculiar velocity fields can be strongly af- 
fected by observational biases. We develop a thorough analysis of these systematic 
effects by relying on specially selected mock catalogues. For the purpose of this paper, 
we use the Monge-Ampere-Kantorovitch (MAK) reconstruction method, although any 
other Lagrangian reconstruction method should be sensitive to the same problems. We 
extensively study the uncertainty in the mass-to-light assignment due to incomplete- 
ness (missing luminous mass tracers), and the poorly-determined relation between 
mass and luminosity. The impact of redshift distortion corrections is analyzed in the 
context of MAK and we check the importance of edge and finite-volume effects on the
reconstructed velocities. Using three mock catalogues with different average densities, 
we also study the effect of cosmic variance. In particular, one of them presents the 
same global features as found in observational catalogues that extend to 80 $h^{-1}$ Mpc
scales. We give recipes, checked using the aforementioned mock catalogues, to han- 
dle these particular observational effects, after having introduced them into the mock 
catalogues so as to quantitatively mimic the most densely sampled currently available 
galaxy catalogue of the nearby universe. Once biases have been taken care of, the
resulting error in reconstructed velocities is typically about a quarter of the overall
velocity dispersion, without significant bias. We finally model our reconstruction
errors to propose an improved Bayesian approach to measure $\Omega_m$ in an unbiased way
by comparing the reconstructed velocities to the measured ones in distance space,
even though the latter may be plagued by large errors. We show that, in the context of
observational data, it is possible to build a nearly unbiased estimator of $\Omega_m$ using
MAK reconstruction.

Key words: dark matter — cosmological parameters — methods: analytical and 
numerical — galaxies: distances and redshifts 



INTRODUCTION 

Galaxy redshift catalogues provide us with the radial veloc- 
ities of the galaxies, 

$cz = H_0 r + v_r$, (1)

which are partly due to the global Hubble expansion ($H_0 r$,
with $H_0$ the present value of the Hubble parameter) and
partly due to the line-of-sight components of the peculiar velocities
($v_r$). Peculiar velocities are the deviations of galaxy
velocities from the uniform Hubble expansion, due to the
non-homogeneous distribution of matter in the Universe.
The peculiar velocities are thus tracers of mass distribu- 
tion in the Universe and can have far-reaching implications 
for cosmology. As tracers of dark matter, peculiar velocities
can be used to determine the local and global distribution
of dark matter. From expression (1), it is evident that
observations of galaxy redshifts ($z$), supplemented by measurements
of radial distances ($r$), would yield the peculiar velocities.
However, measuring distances is a non-trivial exercise.
The Tully-Fisher relation, surface brightness fluctuations,
the Faber-Jackson relation for ellipticals (and its siblings,
including the fundamental plane and the $D_n$-$\sigma$ methods),
the Tip of the Red Giant Branch, Cepheids, and SNIa are
the most usual methods for obtaining distances. The data
gathered are however rather sparse: out of about a million
galaxies whose redshifts are presently known from surveys
such as 2dF and SDSS, only a few thousand have measured
distances. Moreover, for most of these galaxies the peculiar
velocity errors (due essentially to errors in distance
measurements) are too large to be useful in studying dynamics.
For instance, distance indicators such as the Tully-Fisher
relation suffer from 20% relative distance errors and thus
produce quite noisy measurements at relatively moderate
redshifts (i.e. $cz > 3000$ km s$^{-1}$). The data also suffer
from selection biases (Strauss & Willick 1995; Tully & Pierce
2000). One way of reducing the error bars on
distances is to average over many distance measurements for
galaxies in clusters or groups, and to combine the results
from different distance estimators. This treatment decreases
the relative distance errors to about 8% (Tully et al. 2007).
Even if all these difficulties can be surmounted, one can
ultimately only hope to have a sparse sample (as compared to
redshift samples) of radial components of peculiar
velocities. Fortunately, we now have
Lagrangian velocity reconstruction schemes that are based 
solely on current redshift positions of mass tracers. The re- 
constructed velocities depend on cosmological parameters. 
Thus, comparing predictions obtained through Lagrangian 
reconstruction algorithms and the measured velocities may 
give estimations of these parameters. 
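To make the noise level of measured velocities concrete, the following minimal sketch applies relation (1) and propagates a relative distance error into a peculiar velocity error. The function names are hypothetical, the redshift error is neglected, and the example distance is chosen only so that $H_0 r = 3000$ km s$^{-1}$.

```python
# Illustrative sketch of relation (1): cz = H0 * r + v_r.
# Function names are hypothetical. Distances are in h^-1 Mpc and
# velocities in km/s, so that H0 = 100 km/s per h^-1 Mpc.

def peculiar_velocity(cz, distance, hubble=100.0):
    """Radial peculiar velocity v_r = cz - H0 * r."""
    return cz - hubble * distance


def velocity_error(distance, relative_distance_error, hubble=100.0):
    """Error on v_r induced by a relative distance error (redshift error neglected)."""
    return hubble * distance * relative_distance_error


if __name__ == "__main__":
    r = 30.0  # h^-1 Mpc, i.e. H0 * r = 3000 km/s
    for label, frac in [("single distance indicator (20%)", 0.20),
                        ("group-averaged distance (8%)", 0.08)]:
        print(f"{label}: sigma_v ~ {velocity_error(r, frac):.0f} km/s")
```

Under these assumptions the two cases give errors of about 600 and 240 km s$^{-1}$ respectively, which illustrates why individual measurements become noisy beyond $cz \sim 3000$ km s$^{-1}$.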

This brings us to the main point that this paper tries
to address: developing a robust and unbiased method of
Lagrangian peculiar velocity reconstruction using redshift
catalogues, in particular when observational effects distort most
of the data needed for the reconstruction of the dynamics.
The reconstructed velocities are then compared to
the measured ones using an ad hoc algorithm to yield a
measurement of $\Omega_m$, the mean matter density of the Universe.

Throughout the paper, we will try to mimic observa- 
tional effects as they appear in the most densely sampled 
currently available galaxy catalogue of the nearby universe 
which has been compiled by one of the authors (R. B. Tully). 
This galaxy catalogue is built from different sources such
as ZCAT (Huchra et al. 1992) and SSRS (da Costa et al.
1988). Only galaxies for which $cz \leq 8000$ km s$^{-1}$ have been
introduced in the catalogue. This catalogue is named NBG-8k,
standing for NearBy Galaxy catalogue with a depth of
8000 km s$^{-1}$. Although selection criteria for this catalogue
are not well defined, it will prove to be useful for the study
of smaller galaxy catalogues such as NBG-3k (Tully et al.
2007).



For the purpose of this paper, we use a recently developed
technique, called the Monge-Ampere-Kantorovitch reconstruction
method (MAK hereafter), which is an approximation
to the full non-linear dynamics used to trace orbits back
in time. This is a Lagrangian method, like PIZA (Croft
& Gaztanaga 1997) or the Least-Action method (Peebles
1989), and not an Eulerian technique such as, e.g., POTENT
(Bertschinger & Dekel 1989). One must note that the results
of this paper are also valid for the other Lagrangian
reconstruction methods, as all the effects we are going to
analyze are explainable in terms of gravitational dynamics.
The MAK reconstruction has already been extensively discussed
in its application to numerical simulations (Mohayaee
et al. 2006; Brenier et al. 2003). It is based on assuming
that the dark matter displacement field is convex and
potential, i.e. irrotational. In doing so, we exclude
displacement fields which include multistreaming regions. The main
result is that it is then possible to reconstruct accurately
and uniquely the displacement field of dark matter particles
between their original position and their current position.
Practically, to solve the MAK problem, one must minimize
a cost function for the assignment of a dark matter particle
at the present comoving position $\mathbf{x}_i$ to its initial
comoving position $\mathbf{q}_j$:

$$ S_\sigma = \sum_i \left( \mathbf{x}_i - \mathbf{q}_{\sigma(i)} \right)^2 . \qquad (2) $$



If the Universe is assumed to be initially homogeneous,
which is a fair hypothesis supported by CMB data (e.g.
WMAP first year in Bennett et al. 2003),$^1$ then the $\mathbf{q}_j$ must be
distributed on a uniform grid, and the solution to the MAK
problem is unique and given by the assignment $\sigma$ which
minimizes $S_\sigma$. The derived solution is then necessarily
irrotational and derives from a convex potential. To solve this
problem, we have implemented a parallel version of the
so-called "auction" algorithm proposed by Bertsekas (1979).$^2$
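The assignment problem described above can be illustrated on a toy example. The sketch below is a deliberately simple, sequential auction in pure Python, assuming a quadratic cost between present positions and grid points; it is not the parallel, sparse implementation used for the reconstructions, and the names `auction_assignment`, `brute_force_assignment` and the bidding increment `eps` are hypothetical. An exhaustive search over permutations is included only to cross-check the result on a handful of points.

```python
# Minimal sketch of the MAK assignment problem on a toy example.
# Bertsekas-style auction written for clarity, not for performance.
from itertools import permutations


def cost(x, q):
    """Quadratic cost |x - q|^2 between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, q))


def auction_assignment(xs, qs, eps=1e-4):
    """Return sigma such that particle i is assigned to grid point sigma[i].

    Particles bid for grid points; the benefit is minus the quadratic cost.
    The total cost found is within len(xs) * eps of the minimum.
    """
    n = len(xs)
    prices = [0.0] * n
    owner = [None] * n      # owner[j]: particle currently holding grid point j
    sigma = [None] * n      # sigma[i]: grid point currently held by particle i
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        values = [-cost(xs[i], qs[j]) - prices[j] for j in range(n)]
        best = max(range(n), key=values.__getitem__)
        others = [values[j] for j in range(n) if j != best]
        second = max(others) if others else values[best]
        # Raise the price by the bidding margin plus eps.
        prices[best] += values[best] - second + eps
        if owner[best] is not None:
            # The previous holder is outbid and returns to the queue.
            sigma[owner[best]] = None
            unassigned.append(owner[best])
        owner[best] = i
        sigma[i] = best
    return sigma


def brute_force_assignment(xs, qs):
    """Exhaustive minimisation of the total cost, usable only for very small n."""
    n = len(xs)
    return list(min(permutations(range(n)),
                    key=lambda s: sum(cost(xs[i], qs[s[i]]) for i in range(n))))


if __name__ == "__main__":
    grid = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]       # initial positions q_j
    present = [(0.9, 1.2), (0.1, -0.1), (1.1, 0.2), (-0.2, 0.8)]  # present positions x_i
    sigma = auction_assignment(present, grid)
    displacements = [tuple(a - b for a, b in zip(present[i], grid[sigma[i]]))
                     for i in range(len(present))]
    print(sigma, displacements)
```

The displacement of each particle is then simply its present position minus the grid point it was assigned to. The exhaustive search scales factorially and is there only as a check; any practical reconstruction needs an algorithm such as the auction.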



Of course, as we are using an approximation to the dynamics,
the solution to the problem will only be valid above
some scale (typically a few $h^{-1}$ Mpc). Once the solution is
found, the immediate output of MAK reconstruction is the
nonlinear displacement field $\boldsymbol{\Psi}(\mathbf{q}) = \mathbf{x}(\mathbf{q}) - \mathbf{q}$, which can be
used to find the peculiar velocity field $\mathbf{v}$ using the first-order
Zel'dovich approximation:



(3) 



where the subscript $i$ indicates that the comparison is achieved
on the corresponding field averaged over the object $i$ (i.e. in
a Lagrangian way), and the linear growth factor is $\beta \simeq \Omega_m^{5/9}$
(Bouchet et al. 1995). This best fit for $\beta$ is valid as soon as
$\Omega_m + \Omega_\Lambda = 1$, $\Omega_\Lambda$ being the present dark energy density.
It appears then that a direct comparison of $\boldsymbol{\Psi}_i$ against $\mathbf{v}_i$
should in principle give us $\beta$ and thus $\Omega_m$. Though naive
measurements (Mohayaee & Tully 2005) and preliminary
studies (Branchini et al. 2002; Phelps et al. 2006) on mock
redshift catalogues have already been tried, the observational
biases and systematic errors in the velocity-velocity
comparison have never been studied thoroughly.
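As a rough numerical illustration of the velocity-displacement comparison, the sketch below assumes the commonly used fit $\beta \simeq \Omega_m^{5/9}$ for a flat universe with a cosmological constant (Bouchet et al. 1995) and assumes that displacements are expressed in km s$^{-1}$, so that the linear relation reduces to $v = \beta\,\Psi$. Both the unit convention and the function names are illustrative assumptions.

```python
# Sketch of the velocity-displacement relation in the Zel'dovich approximation.
# Assumption: displacements are expressed in km/s (already multiplied by H0),
# so that v = beta * Psi. This convention is chosen for illustration only.

def growth_rate(omega_m):
    """Fit beta ~ Omega_m**(5/9), valid for Omega_m + Omega_Lambda = 1."""
    return omega_m ** (5.0 / 9.0)


def omega_m_from_beta(beta):
    """Invert the fit: Omega_m = beta**(9/5)."""
    return beta ** (9.0 / 5.0)


def zeldovich_velocity(displacement_kms, omega_m):
    """Linear peculiar velocity from a displacement component given in km/s."""
    return growth_rate(omega_m) * displacement_kms


if __name__ == "__main__":
    print(f"beta(Omega_m = 0.30) = {growth_rate(0.30):.3f}")
    print(f"v for Psi = 500 km/s: {zeldovich_velocity(500.0, 0.30):.0f} km/s")
    print(f"Omega_m for beta = 0.5: {omega_m_from_beta(0.5):.3f}")
```

Because $\Omega_m$ scales as $\beta^{9/5}$, a given relative bias on the measured slope translates into a relative bias on $\Omega_m$ that is almost twice as large.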

This paper is organized as follows. In Section 1 we describe
the simulation and the basic mock catalogues that
are used in the rest of this paper. Subsequent mock catalogues
integrate more and more observational features but
are still based on the same original basic mock catalogues
presented in this section. Section 2 gives a model for the
error distribution on MAK velocities and discusses the first
problematic features of the comparison between MAK and
measured velocities. This error distribution helps us in particular
to establish the likelihood analysis in Section 6. We
then turn to the first main topic of this paper in Section 3
by studying the systematic errors introduced by arbitrary
mass-to-light assignments in redshift catalogues. This section
includes a study of the missing mass correction (§ 3.1),
the unknown M/L function (§ 3.2) and incompleteness effects
(§ 3.3; technical details are given in Appendix C). In Section 4
we discuss the problem of redshift distortions and the
way to account for them during the MAK reconstruction.
Section 5 is devoted to the handling of finite-volume and edge
effects, i.e. issues related to the zone of avoidance (§ 5.1),
the choice of the Lagrangian volume of the reconstruction
(§ 5.2), and finally the so-called cosmic variance (§ 5.3).
The last section (§ 6) of this paper investigates the effect
of distance measurement errors on the comparison between
reconstructed and measured velocities, and proposes a maximum
likelihood estimator (§ 6.2) to account for them in
the measurement of $\Omega_m$. Results given by this estimator are
then discussed in § 6.3.

$^1$ Brenier et al. (2003) actually show that uniformity is even
required to prevent singularities in the solution of the Euler-Poisson
system of equations.

$^2$ We implemented a parallel version for shared-memory supercomputers
and MPI clusters. On the Magique2 cluster, it needs
50 minutes on 2 processors to solve the assignment of 74000 particles.
The algorithm is already sparse, i.e. it only looks for candidates
for assignment in a limited region of the catalogue. The
MPI efficiency is here optimal using 2 processors. It must be noted
that the time complexity depends highly on the catalogue that
is being reconstructed. For a given catalogue, the time needed to
solve the assignment problem increases as $N^{2.25}$, with $N$ the number
of particles.



1 MOCK CATALOGUES 

To study various effects and systematic biases on the MAK
reconstructed velocity field, we generated a number of mock
catalogues extracted from an $N$-body simulation (§ 1.1).
Although many recipes will be employed later to address various
observational biases, we will always start from the same
three$^3$ "main" halo catalogues as described in § 1.2. The
first catalogue aims to reproduce to some extent the main
features of the local universe, in particular the presence of
a large cluster at about 40 $h^{-1}$ Mpc and a super-cluster at
about 70 $h^{-1}$ Mpc. The second and the third catalogues have
less salient features but represent locally overdense and
underdense realisations in order to address the problem of
cosmic variance.



1.1 The N-body sample



Our $128^3$-particle $N$-body sample (Mohayaee et al. 2006)
was generated with the public version of the $N$-body code
HYDRA (Couchman et al. 1995) to simulate collisionless
structure formation in a standard $\Lambda$CDM cosmology. The
sample covers a comoving volume of $200^3\,h^{-3}$ Mpc$^3$. The
mean matter density is $\Omega_m = 0.30$ and the cosmological
constant $\Omega_\Lambda = 0.70$. The Hubble constant is $H_0 = 65$ km
s$^{-1}$ Mpc$^{-1}$. The normalisation of the density fluctuations in
a sphere of radius 8 $h^{-1}$ Mpc is $\sigma_8 = 0.99$. We note that this
value of $\sigma_8$ is significantly larger than the value suggested
by present WMAP data, which sets $\sigma_8 = 0.74$ (Spergel et al.
2006), but this should not significantly affect the results
presented in this paper. In fact, a $\sigma_8$ lower than 0.99
would reduce both non-linearities and cosmic variance effects,
hence improving the quality of the measurements.

As the velocity field presents significant fluctuations on
larger scales than the density field, one may worry
about the small size of the simulation volume. We have
checked, using linear theory, that the velocity dispersion in
a $200^3\,h^{-3}$ Mpc$^3$ volume, for our cosmology, is 40 km s$^{-1}$. This value
has to be compared to the typical errors appearing in
velocity reconstructions to ensure that cosmic variance
effects are negligible for our purpose.

$^3$ The computationally high cost of the reconstruction considerably
limits the number of possible realisations.

Figure 1. Sheth & Tormen mass function / Diffuse mass -
The top panel of this plot gives the number density of haloes
in a mass bin as a function of the mass. The round points give
the measurement of this function in the halo catalogue, whereas the
dashed line is obtained using the Sheth & Tormen (2002) theory.
The residuals between the prediction and the measurement are
given in the lower panel (relative differences). Most of the time,
the points are within a few percent of the theoretical prediction.



1.2 The basic mock catalogues 

To build mock catalogues, we have selected haloes from
the $N$-body experiment using a standard friends-of-friends
algorithm with a traditional value of the linking parameter
of 0.2 (Efstathiou et al. 1988). Haloes with
less than 5 particles, i.e. with mass smaller than $M_{\rm min} =
1.62 \times 10^{12}\,M_\odot$, were discarded. Fig. 1 shows the good
agreement between the measured halo mass function and the
Sheth & Tormen (2002) model for haloes with $M > M_{\rm min}$.
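The order of magnitude of the halo mass threshold can be cross-checked from the simulation parameters alone. The sketch below assumes $128^3$ particles in a box of side 200 $h^{-1}$ Mpc with $\Omega_m = 0.30$, and uses the standard critical density of $2.775 \times 10^{11}\,h^2\,M_\odot$ Mpc$^{-3}$; the function name is hypothetical and the result is only a back-of-the-envelope estimate.

```python
# Back-of-the-envelope check of the halo mass threshold.
# RHO_CRIT is the critical density in h^2 Msun / Mpc^3 (standard value);
# the function name is hypothetical.

RHO_CRIT = 2.775e11  # h^2 Msun Mpc^-3


def particle_mass(omega_m, box_size, n_particles):
    """Mass of one N-body particle in h^-1 Msun; box_size in h^-1 Mpc."""
    return omega_m * RHO_CRIT * box_size ** 3 / n_particles


if __name__ == "__main__":
    m_p = particle_mass(omega_m=0.30, box_size=200.0, n_particles=128 ** 3)
    m_min = 5 * m_p
    print(f"particle mass   ~ {m_p:.2e} h^-1 Msun")
    print(f"5-particle halo ~ {m_min:.2e} h^-1 Msun"
          f" = {m_min / 0.65:.2e} Msun (h = 0.65)")
```

With these inputs a five-particle halo weighs roughly $1.6 \times 10^{12}\,h^{-1}\,M_\odot$, the same order as the threshold quoted above; the small residual difference may simply reflect rounding of the constants.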



However, about 63% of the mass is not clumped in these
haloes and is distributed in the background field. In realistic
galaxy samples such as the NBG-8k or the 2MASS catalogue,
the lower mass cut-off is of the order of $10^{11}\,M_\odot$, a
value much smaller than our $M_{\rm min}$. To mimic galaxies with
mass smaller than $M_{\rm min}$, as will be required in the following,
we just use dark matter particles unassigned to any halo as
tracers. The catalogue containing all the haloes and all the
field particles will be called FullMock. One could worry here
that the $N$-body sample that we are using has too low a
resolution, as the spatial distribution of small haloes is biased
whereas that of the particles of the background field is not. We have
actually checked that using a $512^3$ $N$-body sample with nearly
the same cosmology [the simulation is described in Colombi
et al. (2007)] does not change any of the measurements presented here.

Out of FullMock, we have extracted three spherical
cuts of radius 40 $h^{-1}$ Mpc (hereafter denoted by 4k-mockX),
where the velocity-velocity comparisons are conducted, and
their twice deeper counterparts (hereafter denoted by 8k-mockX),
which are used to give better constraints (§ 5.2) on the
reconstruction within the volume of analysis. Each of these catalogues
is centered in a different place in the simulation such that:

- 4k-mock6 is mildly overdense, with an effective mean
matter density $\Omega_{\rm eff} = 0.35$, and contains 495 haloes. It is
designed in such a way that large voids and large concentrations
of matter (clusters or super-clusters) are present near
its boundaries, similarly to what is found in real redshift catalogues
of our local neighbourhood, such as the UZC (Falco et al.
1999), the NBG-3k (Shaya et al. 1995; Tully et al. 2007) and
the NBG-8k. This catalogue and its deeper counterpart, 8k-mock6,
are particularly suited to address edge effects on the
NBG-3k (which terminates at the Hydra and Centaurus clusters)
and the NBG-8k (which stops at the Great Wall), respectively.

- 4k-mock7 is highly overdense, with $\Omega_{\rm eff} = 0.50$, and contains
656 haloes. Very little mass has come in and out of
this volume: it behaves somewhat like an isolated universe,
with small external tides.

- 4k-mock12 is underdense, with $\Omega_{\rm eff} = 0.19$, and contains
213 haloes. It presents as well a low level of density
fluctuations along its boundary.

While there is no ambiguity in setting up a $128^3$ MAK
mesh when using all the haloes and the background particles
(such as in FullMock), it is less trivial to consider lower
resolution meshes that will be used in some of the subsequent
analyses. Indeed, the number of mesh elements assigned
to each tracer is not necessarily an integer anymore.
Appendix A details the general procedure used to associate
elements of the MAK mesh to each tracer.



2 ERRORS IN MAK VELOCITIES

Before going over observational issues, we address errors
intrinsic to MAK reconstruction. First, there is scatter in the
reconstruction of the displacement field itself, which is
expected to be rather small (Mohayaee et al. 2006). Second,
there is scatter due to the Zel'dovich approximation one uses
to convert a displacement field into a velocity field and to
deal with redshift distortions. An accurate knowledge of the
distribution of errors on the reconstructed velocities is
eventually required for the likelihood analysis we want to
introduce in § 6.2. In this section, we measure such a distribution
in real space, while redshift space will be addressed in § 4.
In principle, the width of such a distribution is expected to
increase when observational biases are taken into account,
while its shape should not change significantly.

We consider, in this section, reconstructions based on
the catalogue FullMock, for which periodic boundary conditions
are applied to avoid edge effect problems. We also
assume that we know the mass of all the catalogue
objects (haloes and individual particles). Our subsequent
reconstructions have a resolution between $64^3$ and $128^3$ mesh
elements. We will thus present two reconstructions obtained
on two different initial MAK meshes, $128^3$ and $64^3$, obtained
using the procedure presented in Appendix A. The results
on the reconstructed displacement field are given in Fig. 3.
These plots give the distribution of differences, $P_{\rm de}$, between
the line-of-sight component of the reconstructed displacement
field and the "exact" one, given by the simulation.
The dot-dashed and dashed curves correspond to a
least-squares fit of the function $P_{\rm de}$ corresponding to the
$128^3$ reconstruction with a Gaussian and a Lorentzian,
respectively, the latter being given by

$$ P_{\rm Lor}(x) = \frac{1}{\pi B}\,\frac{1}{1 + (x/B)^2} . \qquad (4) $$



Examination of Fig. 3 supports the Lorentzian approximation
with $B = 35$ km s$^{-1}$, which reproduces the long
tails of $P_{\rm de}$ better than the Gaussian.
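The difference between the two fitting functions is easiest to see in the tails. The sketch below assumes the standard Lorentzian form with width $B$ and compares it to a Gaussian whose width is chosen, arbitrarily and only for illustration, so that both curves have the same peak height; it does not reproduce the fits of Fig. 3.

```python
# Sketch comparing the tails of a Lorentzian and a Gaussian error distribution.
# B = 35 km/s is the width quoted in the text; the Gaussian width used for the
# comparison (matching the peak height) is an arbitrary illustrative choice.
import math


def lorentzian_pdf(x, b):
    """P_Lor(x) = 1 / (pi * B) * 1 / (1 + (x / B)**2)."""
    return 1.0 / (math.pi * b) / (1.0 + (x / b) ** 2)


def lorentzian_fraction_within(x, b):
    """Probability that |error| < x for a Lorentzian of width B."""
    return 2.0 / math.pi * math.atan(x / b)


def gaussian_pdf(x, sigma):
    """Zero-mean Gaussian density of standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    b = 35.0
    sigma = b * math.sqrt(math.pi / 2.0)  # same peak height as the Lorentzian
    for x in (0.0, 35.0, 100.0, 300.0):
        print(f"x = {x:5.0f} km/s  Lorentzian = {lorentzian_pdf(x, b):.2e}"
              f"  Gaussian = {gaussian_pdf(x, sigma):.2e}")
    print(f"fraction within +/- B: {lorentzian_fraction_within(b, b):.2f}")
```

For a Lorentzian, $B$ is the half width at half maximum and exactly half of the errors fall within $\pm B$, while the power-law tails stay orders of magnitude above the Gaussian at a few hundred km s$^{-1}$.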

The width, $B$, of $P_{\rm de}$ is rather small compared to the
line-of-sight dispersion, $\sim 292$ km s$^{-1}$, as expected.
Naturally, the function $P_{\rm de}$ is slightly flatter and
wider for the $64^3$ case than for the $128^3$ one. However, the
far-end tails of $P_{\rm de}$ are the same for $64^3$ and $128^3$. In this
regime, the measurements are not influenced by the resolution
of the grid used to perform the reconstruction but rather
by the inability of MAK to reproduce the internal dynamics
of massive, relaxed objects (Mohayaee et al. 2006).



Fig. 4 is similar to Fig. 3 but considers line-of-sight
reconstructed velocities vs "exact" ones. Although the Zel'dovich
approximation introduces extra noise, as shown by the larger
width of the distribution, $P_{\rm de}$ remains roughly Lorentzian
with a small width $B = 48$ km s$^{-1}$. This error variance is
roughly 25% higher than the expected velocity field variance
on the simulation volume (§ 1.1). We are thus not affected
by cosmic variance effects that could have been induced by
modes larger than the box size of the simulation.

These results are fully supported by the examination of
Fig. 2. However, the lower panels of this figure show that
the joint distribution $P(v_{\rm sim}, v_{\rm rec})$ presents non-trivial tails
above the diagonal line in the lower left quadrant and below
the diagonal line in the upper right quadrant, respectively.
These tails do not disappear even after smoothing of the
velocity field with a 5 $h^{-1}$ Mpc Gaussian window. This is
due to non-linear features in the dynamics not taken into
account by our MAK+Zel'dovich prescription, which produces
a slightly smoother velocity field than the real one.
As a result, the upper left panel of Fig. 2, which corresponds to
the reconstruction, is less contrasted than the upper right
one, which corresponds to the simulation.

Figure 2. Velocity field reconstruction on FullMock - Top panels: A slice of the line-of-sight component of the simulated velocity field,
$v_{r,{\rm sim}}$, and the reconstructed one, $v_{r,{\rm rec}}$, after smoothing with a 5 $h^{-1}$ Mpc Gaussian window. The observer is at the center of this slice.
Bottom panels: Scatter plots between $v_{r,{\rm sim}}$ and $v_{r,{\rm rec}}$ for individual haloes (left) and after smoothing (right).

These non-linear tails give a propeller shape to
$P(v_{\rm sim}, v_{\rm rec})$ which is liable to induce a small bias
on the final velocity-velocity comparison. For instance, one
can estimate the slope of the lower left scatter plot of Fig. 2
using the ratio $s = \sigma_{v,{\rm rec}}/\sigma_{v,{\rm sim}}$, where $\sigma_{v,{\rm rec}}$ and $\sigma_{v,{\rm sim}}$
are the variances of the reconstructed and simulated velocity
fields, respectively. In this case, the estimated $\beta$ is biased
to higher values by about 7%. However, visual inspection of
the scatter shows that no measurement bias should occur if only
the central part of the scatter is used for the computation.
To achieve this, we have first applied an adaptive SPH filter
on the scatter plot to produce a Probability Density Function
(PDF), which is probed by the scatter in the points, on
a regular mesh grid. We then compute the $1.5\sigma$ isocontour,
which encloses the region where the integrated PDF is equal
to 68%. This procedure has already been used in Colombi
et al. (2007) for the gravity-velocity comparison with total
success. Only the points enclosed by the $1.5\sigma$ isocontour
are used to compute the new $s_{{\rm med},68}$ coefficient. The $\beta$
parameter deduced from $s_{{\rm med},68}$ is now statistically unbiased.
Similarly, we define two other slope estimators, $s_{{\rm min},68}$ and
$s_{{\rm max},68}$, whose relevance is discussed in the appendices. In this
paper, until § 6, we will only discuss the measurement of
$\Omega_m$ obtained through the estimation of $s_{{\rm med},68}$. The $\Omega_m$
obtained by this method is identified by a "$1.5\sigma$" label, to
distinguish it from the one obtained through the likelihood
analysis that will be established in § 6, which carries its
own label in the tables and figures. A test of this method
on a simulated scatter distribution, whose shape is built on
analysis of reconstruction errors, is detailed in the appendices.
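One way to see why a "$1.5\sigma$" contour goes together with a 68% enclosed probability in a two-dimensional scatter is the textbook result for a bivariate Gaussian, sketched below. This is only an illustration under a Gaussian assumption; the procedure described above works on an SPH-smoothed PDF, which need not be Gaussian, and the function names are hypothetical.

```python
# For a 2D Gaussian, the probability enclosed by the r-sigma ellipse is
# 1 - exp(-r**2 / 2). Textbook identity, used here only as an illustration;
# the smoothed PDF of the scatter plot is not assumed to be Gaussian in the text.
import math


def enclosed_probability_2d(r_sigma):
    """Probability inside the r-sigma contour of a 2D Gaussian."""
    return 1.0 - math.exp(-0.5 * r_sigma ** 2)


def contour_level_2d(probability):
    """Inverse relation: contour radius (in sigma) enclosing a given probability."""
    return math.sqrt(-2.0 * math.log(1.0 - probability))


if __name__ == "__main__":
    print(f"1.5 sigma encloses {enclosed_probability_2d(1.5):.1%}")
    print(f"68% is enclosed by the {contour_level_2d(0.68):.2f} sigma contour")
```

In one dimension the 68% level corresponds to $1\sigma$; in two dimensions the same probability requires a contour at about $1.5\sigma$.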



3 MASS-TO-LIGHT ASSIGNMENT 

Most reconstruction methods, including ours, infer the total 
matter distribution as a function of the visible matter dis- 
tribution traced by galaxies. The fundamental assumption 
one usually makes is that the relation between these two 
distributions is highly deterministic. In other words, one as- 
signs to each galaxy of a given luminosity L a dark matter 
concentration (a halo) of mass $M = f(L)$. However, there
are several issues in this procedure: 

- Mass-to-light ratio - The choice of a function $f(L)$ considerably
influences the results and is expected to introduce
significant bias on the measured $\beta$ if performed unwisely.
Now, the function $f(L)$ is only coarsely determined (Tully 2005;
Marinoni & Hudson 2002) from direct measurements in
observations. One way to infer this function is to rely on
semi-analytic models of galaxy formation, but this represents a
very strong prior on the measurements. Furthermore, $f(L)$
remains a mean relation around which there can be some
significant scatter. This dispersion can as well introduce some
significant biases.

- Missing tracers / Magnitude limitation - Even if the function
$f(L)$ is perfectly known, fainter galaxies are still missing
in the catalogues due to the limitations of observational
instruments. For instance, in magnitude-limited catalogues,
the number density of detected galaxies decreases with distance
from the observer. These missing tracers have unknown
positions and correspond to a part of the dark matter
distribution which is totally undefined. This missing mass
has to be taken into account in some way.

Figure 3. Error in reconstructed displacements - This
plot displays the probability distribution of the quantity
$\Psi_{r,{\rm rec}} - \Psi_{r,{\rm sim}}$ measured in FullMock (solid curve), where
$\Psi_{r,{\rm rec}}$ and $\Psi_{r,{\rm sim}}$ are the line-of-sight components of the
reconstructed and simulated displacement fields, respectively, after
choosing an observer at the center of the simulation box. The
dashed and dot-dashed curves give the best fit of a Gaussian and
a Lorentzian distribution, respectively. The horizontal axis is the
error on the rescaled displacement field (km/s).

Figure 4. Error in reconstructed velocities - Same as in Fig. 3
but the solid curve corresponds to the probability distribution of
the quantity $v_{r,{\rm rec}} - v_{r,{\rm sim}}$, where $v_{r,{\rm rec}}$ and $v_{r,{\rm sim}}$ are the
line-of-sight reconstructed and simulated velocities, respectively.
The horizontal axis is the velocity error (km/s).

In what follows, we will first address the second issue in
a very simple way, which assumes that the function $f(L)$ is
well known (namely the masses of dark matter haloes themselves)
but that there is a fixed low-mass cut-off. The problem
then consists in determining the unknown part of the dark
matter distribution (namely the particles unassigned to any
halo). Clearly it is correlated with the detected mass tracers
but less clustered. There are two extreme ways to locate this
missing mass:

(a) associate it with the existing tracers, as usually done
in the analysis of real observations;

(b) associate it with a uniform background.

Of course, the real solution is somewhat intermediate between
(a) and (b), as will be shown in § 3.1.

Then, we turn in § 3.2 to the issue of the choice of
$f(L)$. In this paper, we prefer to be as free as possible from
strong priors, so we deliberately do not use results from
semi-analytic models of galaxy formation. Instead, we use
determinations of $f(L)$ from observational data but, unfortunately,
there are large uncertainties in these measurements.
The point here is to quantify, quite heuristically though,
the effect of these uncertainties, random or systematic, on
the measurement of $\beta$. Indeed, one is confronted both with
the possibility of a wrong approximation of $f(L)$ and, most
probably, with a large scatter around this mean relation.

In sufficiently deep galaxy catalogues, the effect of the
missing tracers is expected to be negligible close to the
observer and, in general, to increase with the distance from
the observer. With appropriate weighting of the data, one
can minimize the bias brought by the procedure used to infer
the missing mass distribution far from the observer. In
§ 3.3 we shall illustrate this point by considering the case of
a magnitude-limited catalogue where all the missing mass is
associated with the existing tracers [method (a) above].

Figure 6. Diffuse mass - In this plot, we represent the fraction
of the clustered mass below two mass resolutions for a standard
$\Lambda$CDM type cosmology ($h = 0.65$, $\sigma_8 = 0.99$). We used a
power spectrum as proposed by Bardeen et al. (1986). The curvature
of the Universe is kept flat while $\Omega_m$ varies. This fraction
is plotted for mass resolutions: $2.5 \times 10^{12}\,M_\odot$ (corresponding to
the lower mass limit of haloes in our simulation) and $10^{11}\,M_\odot$
(~ 10^ $L_{\odot,B}$). The unclustered fraction in FullMock is given by
the black filled circle. The fraction of mass below both of these
limits is still considerable.



3.1 Missing tracers 

Fig. 6 shows the expected fraction of the total mass below
a fixed threshold as a function of $\Omega_m$, using the Sheth &
Tormen (2002) model (see also Fig. 1). The solid line corresponds
to the mass cut-off of haloes in FullMock and agrees,
as expected, with the measurement in the simulation for
$\Omega_m = 0.30$. Here, 63% of the mass is outside of the haloes,
which represent our "galaxies" with known M/L ratio. The
particles not linked to the haloes represent the missing mass.
In Fig. 2, their exact location was used to perform the
reconstruction. The only information available now is the
distribution of "visible galaxies". The missing mass needs to
be redistributed using only this information. We
propose two extreme ways to do so:



I. All missing mass to background - Prior to the reconstruction,
the missing mass is divided into particles which
are randomly placed in the catalogue following a Poissonian
distribution. In the example illustrated by the right panels of
Fig. 5, we choose for simplicity particles of the same mass as
those in the simulation.

II. All missing mass in haloes - The missing mass is attributed
to the existing haloes in proportion to their masses,
as illustrated by the left panels of Fig. 5. This approach is
equivalent, in real observations, to multiplying the M/L ratio of
galaxies or groups of galaxies by a constant $a > 1$.
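The two schemes, and any compromise between them, amount to simple mass bookkeeping, sketched below. The parameter `fraction_to_haloes` and the function name are hypothetical: a value of 0 reproduces scheme I, a value of 1 reproduces scheme II. The sketch only computes the masses; drawing the random positions of the background particles is left out.

```python
# Sketch of the two extreme redistribution schemes and of a compromise between them.
# `fraction_to_haloes` is a hypothetical parameter: 0 reproduces scheme I
# (everything in the background), 1 reproduces scheme II (everything in haloes).

def redistribute_missing_mass(halo_masses, missing_mass, fraction_to_haloes):
    """Return (corrected halo masses, mass left for the random background)."""
    if not 0.0 <= fraction_to_haloes <= 1.0:
        raise ValueError("fraction_to_haloes must lie in [0, 1]")
    total_halo_mass = sum(halo_masses)
    # Multiplying every halo mass (hence every M/L) by the same constant
    # distributes the extra mass in proportion to the halo masses.
    scale = 1.0 + fraction_to_haloes * missing_mass / total_halo_mass
    corrected = [scale * m for m in halo_masses]
    background = (1.0 - fraction_to_haloes) * missing_mass
    return corrected, background


if __name__ == "__main__":
    haloes = [20.0, 10.0, 7.0]   # arbitrary units, 37% of the total mass
    missing = 63.0               # the remaining 63%
    for frac in (0.0, 0.6, 1.0):
        corrected, background = redistribute_missing_mass(haloes, missing, frac)
        print(frac, [round(m, 2) for m in corrected], round(background, 2))
```

With 63% of the mass missing, scheme II corresponds to multiplying all halo masses by $1/0.37 \approx 2.7$, and the total mass is conserved for any value of the fraction.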
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Figure 5. Diffuse mass correction - The top panel gives a slice of the line-of-sight component of the simulated velocity field, after 
smoothing with a 5 h~^Mpc Gaussian window. The observer has been put at the center of this slice. The second row of panels represents 
the line-of-sight component of the reconstructed velocity field, smoothed in the same way, for different corrections of the diffuse mass. 
The third row of panels give the scatter distribution of individual reconstructed velocities of haloes vs simulated ones. The left panels 
give the result of a reconstruction on a mock catalog which only contain the haloes and not the background field but at the same time 
conserves the total mass of the catalog by reassigning the missing mass to the haloes. The right panels give the result for a reconstruction 
based on a mock catalogue for which the missing diffuse mass is represented by a background field composed of particles placed randomly 
in the catalogue. The center panels give the result of a reconstruction on a mock catalogue which only contain the haloes and a random 
background field. The mass that have been initially removed from the mock catalogue (the background "galaxies") is reassigned as 
follows: 60% to haloes and 40% to the background. 
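
The two redistribution schemes, and any mixture of them, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the paper: the function name, the `halo_fraction` parameter, the cubic box of side `box_size` and the rescaling of the background particle mass to conserve the total exactly are all hypothetical choices. Setting `halo_fraction = 0` corresponds to scheme I and `halo_fraction = 1` to scheme II.

```python
import random


def redistribute_missing_mass(halo_masses, missing_mass, halo_fraction,
                              particle_mass, box_size, seed=0):
    """Split `missing_mass` between existing haloes and a random background.

    Returns (new_halo_masses, background_positions, background_particle_mass).
    All names and the box geometry are illustrative assumptions.
    """
    if not 0.0 <= halo_fraction <= 1.0:
        raise ValueError("halo_fraction must lie in [0, 1]")

    to_haloes = halo_fraction * missing_mass
    to_background = missing_mass - to_haloes

    # Scheme II part: every halo is boosted by the same factor,
    # i.e. it receives mass in proportion to its own mass.
    boost = 1.0 + to_haloes / sum(halo_masses)
    new_halo_masses = [m * boost for m in halo_masses]

    # Scheme I part: particles dropped uniformly at random in the box.
    n_background = int(round(to_background / particle_mass))
    if to_background > 0.0 and n_background == 0:
        n_background = 1
    rng = random.Random(seed)
    positions = [tuple(rng.uniform(0.0, box_size) for _ in range(3))
                 for _ in range(n_background)]

    # Assumption: the particle mass is slightly rescaled so that the
    # total mass is conserved exactly despite the rounding above.
    background_particle_mass = (to_background / n_background
                                if n_background else 0.0)
    return new_halo_masses, positions, background_particle_mass
```

For instance, `redistribute_missing_mass(masses, m_missing, 0.6, m_p, box)` would mimic the 60%/40% split shown in the center panels of Fig. 5.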



Obviously, in I, the screening effect due to the background
is exaggerated, hence the reconstructed velocity field is
less contrasted and β is over-estimated to compensate for
this. In II, on the other hand, the potential wells are more
contrasted than they should be, which leads to the opposite
effect. At this point, it is extremely tempting to try to find a



simple compromise between I and II, as illustrated by the middle
panels of Fig. 5, where 60% of the missing mass was linked
to the tracers and the remainder to a uniform background.
With this particular choice of redistribution, the match
between the reconstructed and the simulated velocity fields
is spectacular. This result is non-trivial given the simplicity
of the handling of this sixty-three percent of missing mass,
all the more since the scatter in the lower middle panel of
Fig. 5 is of the same order as that of the lower left panel of
the figure where all the tracers contribute optimally.

Although the optimal redistribution remains a priori unknown
in a real galaxy catalogue, one can at least infer error bars
from I and II. In that framework, Fig. 5 unfortunately provides
quite a poor constraint on β, 0.36 < β < 0.85. However, in real
galaxy catalogues, such as the NBG-3k or the NBG-8k, the minimum
luminosity is of the order of 10^ L_⊙. This corresponds to a less
abrupt mass cut-off, M_cut ~ 10^^ M_⊙, than in Fig. 5, where
M_cut = 2.5 × 10^^ M_⊙. Therefore, one expects the problem
of missing mass to be less salient in real observations, as
illustrated by the dashed curve of Fig. 6. Furthermore, an
appropriate use of mock catalogues can help in calibrating
the redistribution of mass, as performed in the middle panels of
Fig. 5.

Figure 7. M/L assignment - Sketch of the procedures used to
test the influence of the choice of M/L assignment, as explained
in the main text.



3.2 Mass-to-light ratio

To test how the choice of mass assignment to galaxies or
groups of galaxies affects the results, we consider the three
following cases, as summarized in Fig. 7:

(i) T-C case: a galaxy catalogue is extracted from FullMock
by associating a luminosity L(M) to each dark matter
halo or background particle using Tully's latest best fit of
the group mass-luminosity relation (Tully 2005, see Fig. 8)

2700

M

-6x10^^ M_⊙/M

(5)

which gives the luminosity in the B band for groups in the
mass range 10^^ M_⊙ - 10^^ M_⊙. Then a new mass is given
to each tracer assuming

\[ M/L = \mathrm{constant}, \tag{6} \]

as often used in the literature, and MAK reconstruction is
performed on a resampling of this mass distribution.

(ii) T-MH case: a less extreme case than assuming
M/L = constant consists in separating the tracers into
three broad classes: faint galaxies, luminous galaxies and
groups/clusters of galaxies, as performed by Marinoni & Hudson
(2002), hereafter MH. To do this, they used a simple
mapping between the Schechter luminosity function and the
Press-Schechter mass function that reads as follows

M/L = 1.15 10^ (^) 
M/L = 128/1^ 
M/L = 3.6 10-^(4)' 

M0 
.^0 
^0 

4 10^° < 7^ < 4 10^^ 
^ > 4 10^^ 
< 4 10^ 

(7)

as shown in the upper panel of Fig. 8. In this framework, we
generated the same catalogue as in the T-C case, but it was analyzed
assuming the M/L function given by Eq. (7).

(iii) TS-T case: assuming that we have an unbiased estimator
of the M/L function, there can still be a scatter
around this mean value that can increase the errors and
also introduce a systematic bias. We test this by multiplying
the mass of each halo of FullMock by a random number x
such that log_10 x is uniformly distributed in [-1, 1], prior to
MAK reconstruction, which is performed on a resampling of
the halo catalogue following the procedure explained in
Appendix A. Note that the mass of background particles remains
unchanged during the process, which corresponds to
63% of the matter distribution being unaffected by the scatter.
However, applying the scatter to small mass haloes
only introduces a local additional noise, which should not
have any significant consequences on the reconstruction
accuracy, for which deeper potential wells are in fact more
critical.
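
The TS-T transformation of case (iii) is simple enough to be written down directly. The sketch below is a minimal illustration, assuming that halo masses are held in a plain list; the function name, the seeding and the way the total halo mass is restored (a single global rescaling) are hypothetical choices rather than the procedure actually used.

```python
import random


def scatter_halo_masses(halo_masses, seed=0, log10_half_width=1.0,
                        conserve_total=True):
    """Multiply each halo mass by x, with log10(x) uniform in [-w, w].

    Background particles are not passed in, so they stay untouched,
    as in the TS-T case described above.
    """
    rng = random.Random(seed)
    scattered = [
        m * 10.0 ** rng.uniform(-log10_half_width, log10_half_width)
        for m in halo_masses
    ]
    if conserve_total:
        # Assumption: total halo mass is restored by one global factor.
        scale = sum(halo_masses) / sum(scattered)
        scattered = [m * scale for m in scattered]
    return scattered
```

With the default half-width of one decade, individual masses can change by up to a factor of ten in either direction before the optional renormalization.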

We want to highlight the fact that each of these transformations,
which actually corresponds to transforming the mass
of an object of FullMock through an M → L → M operation,
does not correspond to an identity. One actually gets a
new set of masses attached to each tracer which is different
from the original one. Moreover, the output mass distribution
P_mass,out(M) may be fundamentally different from the
input one P_mass,in(M). Indeed, computing P_mass,out(M) is
equivalent to performing a weighted average of P_mass,in(M).
This procedure induces a global reshaping of the distribution.
Consequently, the statistical properties of the corresponding
mass density field may be affected.

More technically, during the procedure used to construct
all the catalogues above, total mass conservation is enforced.
Note that the total mass depends on Ω_m h^2, but this
normalization does not affect MAK displacements, which
are sensitive to density contrasts only. The parameters Ω_m and
h in fact intervene while performing the velocity-velocity
comparison and while converting distances to velocities (§ 6),
respectively.

As expected, random uncertainty on the mass determination
does not introduce any bias; it only increases the
scatter in the measurements, as illustrated by the lower left
panel of Fig. 9. A more important issue is the global knowledge
of the M/L relation. Indeed, it seems that the slope of
this relation greatly influences the results, as illustrated by
the middle and right panels of Fig. 9. Clearly, if the galaxies
follow the Tully formula (5), it is definitely wrong to
assume a constant M/L, and even the MH fit introduces a
significant bias, although it is well within the observational
errors compared to Eq. (5). It must be noted that this bias
can be turned into an advantage if one does not want to
measure Ω_m but the M/L relation. Indeed, the WMAP experiment
(Bennett et al. 2003; Spergel et al. 2006) coupled with

Figure 9. M/L bias - The top panel gives the expected line-of-sight component v_r of the velocity field, smoothed with a 5 h^-1 Mpc
Gaussian filter, as given by the simulation in a thin slice containing the observer. The middle panels give the
reconstructed v_r field, with the same smoothing, after having applied each of the transformations specified in Fig. 7 to FullMock. The
lower panels give the scatter between the reconstructed and simulated peculiar velocities for each of the transformations.



Table 1. M/L bias effect - This table gives the results obtained using different statistical tools. We also measured Ω_m using six different
methods: the label s means we used the slope estimated by using all objects, the label L is used when Ω_m has been determined using
the likelihood analysis, and the label 1.5σ is used when the slope is estimated using only the objects within the 1.5σ isocontour of the
PDF between reconstructed velocities and simulated velocities (method described in § 2).

Transf.   Velocity            Ω_m
          s     r     σ       (s)    (L_min)  (L_max)  (1.5σ,s_med)  (1.5σ,s_min)  (1.5σ,s_max)
None      0.88  0.89  0.58    0.38   0.30     0.31     0.30          0.28          0.31
TS-T      0.90  0.78  0.64    0.36   0.26     0.30     0.28          0.24          0.33
T-MH      0.80  0.80  0.60    0.45   0.33     0.38     0.36          0.32          0.40
T-C       0.71  0.78  0.63    0.55   0.40     0.48     0.44          0.37          0.51



an analysis of the power spectrum of the large-scale density of
galaxies (Tegmark et al. 2006) gives good constraints on the
real Ω_m now. Our method, on the other hand, is able to
measure the discrepancy between the measured β and the
expected growth factor β_expected (i.e. the bias). This
measurement may give an idea of how wrong the assumed
M/L relation is prior to the reconstruction and may push us
to try different plausible M/L functions. Thus our method
is able to measure the way that matter is distributed in
the Universe once its average density Ω_m is given. On the
other hand, if the above bias is well understood, this method



helps at reducing the degeneracy in the determination of cos- 
mological parameters. Indeed, our posterior probability on 
(r^m,^) gives a constraint orthogonal (for example see the 
results in Mohayaee & Tully 2005) to the one obtained from
the WMAP experiment and from the galaxy statistics of the 
SDSS. 



3.3 Magnitude limitation 

Magnitude-limited sampling of mass tracers introduces a
new type of problem: flux limitation decreases the mass
resolution toward the outer edges of the catalogue, contrary to
the homogeneous case studied in § 3.1. Usually, the incompleteness
is handled by uniformly boosting the luminosities
of galaxies at a given distance from the observer (Branchini
et al. 2002), prior to conversion of luminosities into
masses. This is a fair approach if M/L = constant, modulo
the issues discussed in § 3.1. However, this method is
in general questionable for non-trivial M/L relations such as
those of § 3.2, or if different M/L's are assigned to galaxies
of different types. In these last two cases, the missing mass
correction should be applied to the mass distribution itself
instead of the luminosity one, to avoid systematic errors on
mass assignments, hence on reconstructed velocities. This
unfortunately requires a prior assumption on the value of
Ω_m, but only slightly complicates the analyses.

Figure 8. M/L function - The two plots give the forward
and inverse mass-to-light functions for both the Tully (2005)
and Marinoni & Hudson (2002) fits. The top panel gives M/L
as a function of the luminosity L, the bottom panel gives L/M
as a function of the mass M.

In the observational data, galaxies are separated into
two populations: groups of galaxies (Tully 1987) and field
galaxies. These two populations should be treated separately,
keeping in mind that the groups are the most critical
because their gravitational influence is much larger than that
of individual field galaxies and they have better peculiar velocity
measurements.

Groups are defined here as compact sets of 5 galaxies or more.

Figure 10. Magnitude limitation - Solid line: NBG-8k predicted
luminosity incompleteness at the given distance from the
observer. Dashed line: Simulated luminosity incompleteness in
8k-mock6. The incompleteness is expressed in terms of missing
luminosity fraction at the specified distance.

The full procedure consisting of creating a magnitude-limited
mock catalogue and recovering the mass distribution
is detailed in Appendix C. Let us recall that, in our mock
catalogues, groups of galaxies are simulated dark matter
haloes with more than 5 particles, while background galaxies
are identified with dark matter particles unassigned to
any halo. We list here the key steps we used to correct for
incompleteness:

I. The total apparent luminosities of groups of galaxies are
obtained assuming a global or a local Schechter luminosity
distribution for the considered groups. The intrinsic luminosity
is computed trivially from the total apparent luminosity
and the redshift of the group.

II. The intrinsic luminosity of the remaining unbound
galaxies (thus field galaxies) is also determined
straightforwardly.

III. Then, masses are estimated by assigning an appropriate
M/L to each object of I and II.

IV. The local missing mass from undetected background
galaxies is inferred from the detected mass distribution. This
requires a prior on Ω_m.

V. This missing mass may either be reassigned locally to
detected field galaxies of II (our choice) or be introduced by
means of new randomly positioned tracers, as discussed
in § 3.1.
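
Steps IV and V can be illustrated with a deliberately crude sketch. It is not the procedure of Appendix C: here the missing mass of each radial shell is simply taken as the difference between the mass expected at the mean density (for an assumed prior Ω_m) and the detected mass, which ignores genuine density fluctuations, and it is handed to the detected field galaxies of that shell in proportion to their masses. The function name, the tuple layout and the shell binning are assumptions; distances are in h^-1 Mpc and masses in h^-1 M_⊙, so that the critical density is 2.775 × 10^11 in these units.

```python
import math

# Critical density in (h^-1 M_sun) / (h^-1 Mpc)^3.
RHO_CRIT = 2.775e11


def reassign_missing_mass(tracers, shell_edges, omega_m):
    """Boost field-galaxy masses shell by shell (illustrative only).

    tracers: list of (distance, mass, is_field) tuples.
    Returns the list of corrected masses, in the input order.
    """
    mean_density = omega_m * RHO_CRIT
    new_masses = [mass for _, mass, _ in tracers]

    for r_in, r_out in zip(shell_edges[:-1], shell_edges[1:]):
        in_shell = [i for i, (d, _, _) in enumerate(tracers)
                    if r_in <= d < r_out]
        volume = 4.0 / 3.0 * math.pi * (r_out ** 3 - r_in ** 3)
        expected = mean_density * volume            # step IV, needs omega_m
        detected = sum(tracers[i][1] for i in in_shell)
        missing = max(expected - detected, 0.0)

        field = [i for i in in_shell if tracers[i][2]]
        field_mass = sum(tracers[i][1] for i in field)
        if missing == 0.0 or field_mass == 0.0:
            continue

        # Step V (the "our choice" branch): give the missing mass to
        # detected field galaxies, in proportion to their own mass.
        factor = 1.0 + missing / field_mass
        for i in field:
            new_masses[i] = tracers[i][1] * factor
    return new_masses
```

Groups keep their estimated masses in this sketch; only field galaxies absorb the correction, which mirrors the choice stated in step V.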



To examine the effects of systematics in the correction
for incompleteness, we use 8k-mock6 and choose a flux limit
such that the resulting mock catalogue has an incompleteness
similar to NBG-8k, as shown in Fig. 10. Results are
summarized in Fig. 11 and in Table 2.

The reconstructed radial peculiar velocities v_r,rec behave
extremely well. On average, the comparison between
simulated and reconstructed velocity fields is surprisingly
good in a volume of radius 80 h^-1 Mpc, even though the
edge locally misses 98% of the field galaxies, which represent
60% of the total mass in our mock catalogue. This means
that, though we keep only 2% of the field galaxies, they suffice,
in addition to the groups, for a reasonably fair recovery
of the large-scale peculiar velocity field. Note the small bias
in the scatter of the lower right panel of Fig. 11, resulting
in a slightly larger Ω_m = 0.38 than the expected value of
0.30, but in good agreement with the effective value of 0.35
expected in the corresponding volume (see § 5.3 on cosmic
variance effects). This bias might be the consequence of our
treatment of the missing mass coming from undetected tracers,
as discussed in detail in Appendix C (point B).



4 REDSHIFT DISTORTION 

The input of MAK reconstruction is the position of objects
in real space, as needed by Eq. (2). However, redshift
catalogues give us galaxy positions in redshift space, namely
s_r = H d + v_r, where s_r is the redshift distance, d is the
luminosity distance between the observer and the object and
v_r is the line-of-sight peculiar velocity. To account for
redshift distortions, we must correct for two major effects:

- "Fingers-of-god" correspond to an elongation of dense 
structures along the line of sight, such as clusters of galaxies, 
due to random motions of galaxies within these structures. 

- The Kaiser effect (Kaiser 1987) is a large-scale effect coming
from the coherent part of the cosmic flows, which, for
instance, increases the overall density contrast.

Fingers-of-god effects can be easily removed by simply 
collapsing groups or clusters to a single point, as usually 
performed in the literature. However, such a procedure is 
generally carried out in a rather ad-hoc way and is certainly 
not free of biases. 

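The Kaiser effect can be accounted for by modifying the
cost function (2) using the Zel'dovich approximation to infer
line-of-sight peculiar velocities as functions of the sought
displacement field (Mohayaee & Tully 2005; Valentine et al.
2000). If s(q) is the redshift coordinate of a particle originally
at q, then the total cost (2) of the association σ becomes:

\[ I_\sigma = \sum_i \left[ \left(\mathbf{s}_i - \mathbf{q}_{\sigma(i)}\right)^2 - \frac{\beta(2+\beta)}{(1+\beta)^2} \left( \left(\mathbf{s}_i - \mathbf{q}_{\sigma(i)}\right) \cdot \hat{\mathbf{s}}_i \right)^2 \right], \tag{8} \]

where β is the linear growth factor and \hat{\mathbf{s}}_i is the unit
vector along the line of sight of object i. Once the redshift
displacement Ψ_s = s - q has been computed, the reconstructed
radial peculiar velocity of the object i can be obtained by

\[ v_{r,i} = \frac{\beta}{1+\beta} \, \boldsymbol{\Psi}_{s,i} \cdot \hat{\mathbf{s}}_i . \tag{9} \]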
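
A quick numerical check helps to see why Eqs. (8) and (9) are exact for a Zel'dovich displacement. The sketch below assumes that velocities are expressed in distance units (i.e. already divided by the Hubble constant), that the peculiar velocity is β times the real-space displacement Ψ, and that the observer sits at the origin; the helper names are illustrative. Under these assumptions the summand of Eq. (8) reduces to |Ψ|^2, the real-space cost, and Eq. (9) returns the true radial velocity.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(a):
    norm = math.sqrt(dot(a, a))
    return [x / norm for x in a]


def to_redshift_space(q, psi, beta):
    """Map a particle at Lagrangian position q, displaced by psi, to
    redshift space, assuming the Zel'dovich relation v = beta * psi."""
    x = [qi + pi for qi, pi in zip(q, psi)]
    r_hat = unit(x)
    v_r = beta * dot(psi, r_hat)          # line-of-sight velocity
    s = [xi + v_r * ri for xi, ri in zip(x, r_hat)]
    return s, v_r


def redshift_cost_term(s, q, beta):
    """Summand of Eq. (8) for one (s, q) pair."""
    psi_s = [si - qi for si, qi in zip(s, q)]
    radial = dot(psi_s, unit(s))
    weight = beta * (2.0 + beta) / (1.0 + beta) ** 2
    return dot(psi_s, psi_s) - weight * radial ** 2


def radial_velocity(s, q, beta):
    """Eq. (9): radial velocity from the redshift-space displacement."""
    psi_s = [si - qi for si, qi in zip(s, q)]
    return beta / (1.0 + beta) * dot(psi_s, unit(s))
```

The identity holds as long as the redshift-space position stays on the same side of the observer as the real-space one; it is this condition that breaks down close to the origin.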



The cost function I_σ leads to the exact result in the case
of a Zel'dovich displacement field without shell crossing after
redshift distortion. However, in general, the second term
(accounting for redshift distortion) of Eq. (8) becomes of
the same order as the first term (the real space cost term)
near the origin. In this case, the reconstruction becomes
ill-defined because of the loss of convexity of the functional I_σ.
We thus expect the central part of all catalogues to be, in
general, poorly reconstructed. The size of such a region is
roughly determined by the magnitude v_obs of the large-scale
flow near the observer with respect to the Cosmic Microwave
Background. The velocity v_obs determines the relative
contribution of the first term with respect to the second
term of Eq. (8). In practice, v_obs is of the order of a few
hundred km s^-1 (for instance the Local Group velocity is
630 km s^-1; Erdogdu et al. 2006), which gives us a region of
"exclusion" with a radius of about a few h^-1 Mpc.

Again, MAK reconstruction fails in regions where shell 
crossings occur. Projection in redshift space generates such 
shell crossings along the line-of-sight. These shell crossings 
are dramatic because of their anisotropic nature. In par- 
ticular, filaments can cross each other while passing from 
real to redshift space, implying the reconstruction will fail 
in a large region of the catalogue encompassing the gravita- 
tional influence of these filaments. In this area, most of the 
reconstructed radial velocities will have the opposite sign 
compared to the true velocity. Of course, shell crossings in 
redshift space can have more complex consequences but this 
simple example suggests that MAK reconstruction should 
not work as well in redshift space as in real space.

Another problem of this method is that one must assume
β prior to the reconstruction. As in § 3.3, where we
had to guess the undetected mass, we choose a value Ω_m,in,
thus an assumed β_in, then we make a redshift reconstruction
and measure Ω_m,out. In practice, the "true" Ω_m of the
catalogue was chosen to be the one for which Ω_m,in = Ω_m,out,
which corresponds to having self-consistent orbit modelling
when doing MAK reconstruction and when one makes a
comparison with measured velocities.
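
The self-consistency requirement Ω_m,in = Ω_m,out is a fixed-point condition, and one possible way to solve it is a bisection on the difference between the two. The sketch below is an assumption about how such a search could be organised, not a description of what was actually done: `measure_omega_out` is a hypothetical callable standing for a full redshift-space reconstruction followed by the velocity comparison, and the bracket and tolerance are arbitrary.

```python
def self_consistent_omega_m(measure_omega_out, lo=0.05, hi=1.0,
                            tol=1e-4, max_iter=60):
    """Find omega with measure_omega_out(omega) == omega by bisection.

    measure_omega_out is a placeholder for "reconstruct with omega_in,
    then measure omega_out"; each call may be very expensive.
    """
    f_lo = measure_omega_out(lo) - lo
    f_hi = measure_omega_out(hi) - hi
    if f_lo * f_hi > 0.0:
        raise ValueError("bracket does not contain a self-consistent value")

    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = measure_omega_out(mid) - mid
        if f_lo * f_mid <= 0.0:
            hi, f_hi = mid, f_mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Since every evaluation would require a complete reconstruction, a coarse grid of Ω_m,in values followed by interpolation is an equally plausible strategy in practice.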

Fig. 12 shows both reconstructed and simulated velocity
fields and the scatter between v_r,rec and v_r,sim. The first
impression when comparing the two top panels of Fig. 12 is that
the redshift reconstruction behaves really well. However,
some potentially worrying localized features are present:

- Some important structures have their velocities badly
reconstructed. Two important examples are the green-yellowish
finger just above the center of the upper right
panel of Fig. 12 and the big velocity peak at the top of this
same panel. In the left panel, these two structures are not
so prominent. The difference can be understood by studying 
the impact of the Kaiser effect on the reconstructed velocity 
field. Basically, two nearby filaments can merge in redshift 
space and give birth to a filament with a higher apparent 
density. The reconstruction is not able to separate these two 
filaments, which leads to an area with higher reconstructed 
velocities than the true ones. Thus, we expect in observa- 
tional data to meet problems in the neighbourhood of the 
Great Wall, which is a supercluster of filaments compressed 
by redshift distortion. 

- The velocity field in the immediate (5-10 h^-1 Mpc)
neighbourhood of the mock observer has lost its spatial
structure and even presents a spurious peak. This is, most
unfortunately, an expected problem that is linked to the
above discussion on the problems of I_σ near the observer.
Indeed, in the neighbourhood of the observer, I_σ becomes
singular and the reconstruction most likely misses the right
orbits. Analysing the smoothed velocity field seems to show
that this effect looks in practice much like the one just above:
the reconstructed velocity field may be boosted by the merging
of different structures in the neighbourhood of the observer.

^ See e.g.

Figure 11. Incompleteness: magnitude limitation - Top panels: A slice of the line-of-sight component of the simulated velocity field in
8k-mock6 and the reconstructed one, after smoothing with a 5 h^-1 Mpc Gaussian window. The displayed slice is chosen to include the
observer in (0, 0). The white circle in the right panel gives the size of the 40 h^-1 Mpc sphere embedded in the 80 h^-1 Mpc one. Bottom
panels: The scatter plots compare the reconstructed and simulated velocities of objects in the 80 h^-1 Mpc region (left panel) and in the
40 h^-1 Mpc volume (right panel).



Table 2. Incompleteness: magnitude limitation - Column description is given in the caption of Table 1.

Volume    Velocity field      Ω_m
          s     r     σ       (s)    (L_min)  (L_max)  (1.5σ,s_med)  (1.5σ,s_min)  (1.5σ,s_max)
8k        0.86  0.77  0.64    0.39   0.26     0.31     0.29          0.25          0.34
4k        0.77  0.75  0.66    0.48   0.37     0.45     0.38          0.30          0.47



Table 3. Redshift reconstruction - Column description is given in the caption of Table 1.

s     r     σ       Ω_m
                    (s)    (L_min)  (L_max)  (1.5σ,s_med)  (1.5σ,s_min)  (1.5σ,s_max)
0.83  0.46  0.95    0.50   0.22     0.29     0.27          0.22          0.33

Figure 12. Redshift distortion correction - Top panels: A slice of the smoothed velocity field v_r,sim and v_r,rec is shown in the left
and right panels, respectively. The two fields have been smoothed with a 5 h^-1 Mpc Gaussian window, with objects put at their real
(simulated and reconstructed) comoving coordinates. Bottom panels: Scatter plots between v_r and v_r,rec for individual mass tracers. The
left panel (right panel) was produced using a real space reconstruction (redshift space reconstruction, respectively). In both cases, only
objects within a sphere of 8000 km s^-1 are shown.



- The lower right panel presents two additional off-diagonal
tails compared to the lower left panel. As discussed
earlier, these tails are due to shell-crossings occurring along
the lines of sight when passing from real to redshift space.
These extra shell-crossings result in some reconstructed
velocities acquiring a sign opposite to the true velocities.

Similarly to § 2, we have computed in Fig. 13 the
distribution of differences PyE between v_r,rec and v_r,sim, for
a redshift reconstruction applied on 8k-mock6 based on a 64^3
mesh. Though the distribution is of course wider than in
Fig. 4, the previously drawn conclusions are still valid. PyE
is better fitted by a Lorentzian distribution of width 86 km
s^-1 than by a Gaussian of width σ = 91 km s^-1, particularly
in the tails.

To check the effects of redshift distortion on the quality
of the reconstruction, one can compare Table 3 to the first
row of Table 1. As usual, the s parameter is slightly biased
below unity due to nonlinear effects discussed in § 2, which
seem, not surprisingly, to be slightly enhanced by redshift
distortions. The appearance of the off-diagonal tails in the
lower right panel of Fig. 12 increases the level of scatter,
hence the correlation coefficient r decreases and the
signal-to-noise ratio increases. Reducing the analysis to the region
inside the 1.5σ isocontour greatly improves the results, as
expected, but still leads to a value of Ω_m slightly biased to
lower values, Ω_m = 0.27.

The finiteness of the catalogue volume is handled in § 5.2.



5 EFFECTS OF CATALOGUE GEOMETRY 

In practice, real galaxy catalogues are not spatially periodic 
as is our simulation. They represent a region of finite vol- 
ume with non-trivial geometry. In particular, two kinds of 
problems arise: 

- Edge effects - Reconstruction of the galaxy trajectories
without any piece of information on what may affect them
dynamically from the outer parts of the catalogue is likely to
introduce significant sources of errors, possibly systematic.
We separate here edge effects into two subclasses: the effects
of the obscuration by our Galaxy, which defines a Zone of
Avoidance (hereafter ZOA), and the effects of the finite depth of
the catalogue. These two effects need a separate treatment,
detailed in § 5.1 and § 5.2.

- Cosmic variance - The finite volume of the accessible
part of the Universe might be a potentially unfair realization
of the random process underlying the properties of the large
scale matter distribution. We must investigate whether our
method, including the handling of edge effects, is robust when
recovering the statistical properties of the whole Universe
from observations of only a fraction of it.

Figure 13. Error distribution of the reconstructed velocity field,
redshift space - Same as in Fig. 4, but the solid curve corresponds
to the probability distribution of the quantity v_r,rec - v_r,sim, where
v_r,rec and v_r,sim are the line-of-sight reconstructed and simulated
velocities, respectively. Horizontal axis: velocity error (km/s).



5.1 Zone of avoidance

Dust present in the Milky Way's galactic plane highly
attenuates the light, thus galaxy catalogues generally do not
provide any data in the direction of the ZOA (approximately
the region within |b| < 5 deg, where b is the galactic latitude).
This strong attenuation introduces a boundary effect,
which has the unpleasant feature of being present at any
distance from the observer and may thus severely affect the
measurements. As this area is nonetheless relatively small,
particularly at low redshift, a simple correction should be
able to largely remove the boundary effect in the inner
region of the catalogue.

Simulating the effect is made easy by putting an observer
at the center of the simulation volume and by removing
all mass tracers in the neighbourhood of the galactic
plane z = 0, i.e. those which have |b| < a. This gives us
FullMockZOA.

Though more advanced ways of filling the ZOA exist
(e.g., Lahav et al. 1994; Fontanot et al. 2003), the latter is
here sufficiently small to be dealt with by the following simple
algorithm. Since the statistical properties of the galaxies
should not change across the boundaries of the ZOA, the
objects in its neighbourhood can be used to fill the zone. We
build new mass tracers to fill the obscured area by applying a
locally planar symmetry transformation to the galaxies and
groups with -3a < b < -a according to the "plane" -a. We
execute the same operation on objects with +a < b < +3a
but according to the "plane" +a. In the end, the masses of
the copied haloes in the ZOA are divided by two and we only
take half of the field galaxies. This method has been used
previously to fill the zone of avoidance in NBG-3k (Shaya
et al. 1995) and NBG-8k. This folding procedure has been
applied to FullMockZOA, slightly moving some of the newly
created objects to enforce the periodicity of the simulation
box, to avoid mixing the effect of the ZOA with other boundary
effects. The results are presented in Fig. 14. As expected,
the ZOA has a clear impact on the errors of the reconstructed
velocities.
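
The folding algorithm described above can be sketched as follows. This is a simplified illustration under several assumptions: the reflection is applied directly to the galactic latitude b (in degrees) at fixed longitude and distance, which only approximates a locally planar symmetry; "half of the field galaxies" is implemented as a random selection with probability 1/2; and the tuple layout and function name are hypothetical. The adjustment made to enforce the periodicity of the simulation box is not reproduced.

```python
import random


def fill_zone_of_avoidance(tracers, a_deg, seed=0):
    """Create mirrored tracers that fill the strip |b| < a_deg.

    tracers: list of (longitude, b, distance, mass, is_halo) tuples,
    angles in degrees. Returns only the newly created tracers.
    """
    rng = random.Random(seed)
    filled = []
    for lon, b, dist, mass, is_halo in tracers:
        if a_deg < b < 3.0 * a_deg:
            b_new = 2.0 * a_deg - b        # mirror about the "plane" +a
        elif -3.0 * a_deg < b < -a_deg:
            b_new = -2.0 * a_deg - b       # mirror about the "plane" -a
        else:
            continue

        if is_halo:
            # Both sides fill the same strip, so copied haloes get
            # half of their original mass.
            filled.append((lon, b_new, dist, 0.5 * mass, True))
        elif rng.random() < 0.5:
            # Assumption: keep each copied field galaxy with
            # probability 1/2, i.e. half of them on average.
            filled.append((lon, b_new, dist, mass, False))
    return filled
```

Because each side is mirrored over a strip of width 2a, every point of the ZOA receives copies from both sides, which is why the halo masses are halved and only half of the field galaxies are kept.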

The typical errors on the reconstructed velocities, represented
in the left panel of this figure, rise substantially in
the vicinity of the obscured area. Fortunately, they remain
well below the natural velocity dispersion of the simulation
(dashed line). As we are comparing velocity fields filtered
with a 5 h^-1 Mpc Gaussian window, we expect the reconstructed
velocity field to be nearly error free for all points
nearer than about 60 h^-1 Mpc. It is also fortunate that we have
not introduced an extra bias using the filling algorithm, as
shown both by comparing Table 4 to the first row of Table 1
and by looking at the scatter plot in the right panel of
Fig. 14. We nonetheless highlight that the edge effect is
not at all localized near the ZOA but extends quite far away
and becomes negligible only for |b| > 20 deg. Table 4 shows
that the above extra noise does not have any impact on the
measured Ω_m.



5.2 Lagrangian domain 

The inputs to MAK reconstruction are the present coordinates
of the objects, i.e. x in Eq. (2) or s in Eq. (8), and
the knowledge of the Lagrangian domain, i.e. q in Eq. (2) or
(8). Redshift catalogues give the present "positions" of the
objects, i.e. s in Eq. (8); however, we have no observations
that would give us the corresponding Lagrangian domain q.
We are thus limited to making guesses, though in the end, for
huge catalogues, the details of the guess do not matter, as
gravitational forces are screened on large scales by the nearly
homogeneous distribution of matter in the universe.
Consequently, what happens at the boundaries should not strongly
affect the central part of the catalogue, though some guesses
may be better at confining the edge effects to the boundaries.
The naive solution is to assume that the Lagrangian
domain is not so different from the volume of the catalogue
itself. This assumption only begins to be a good approximation
for a volume enclosed in a sphere whose radius is
big enough. For our 80 h^-1 Mpc sample, the mass going in
and out of the volume (from initial to present time) already
represents about 16% of the total mass. For a 40 h^-1 Mpc
sphere, the mass flow is even greater: it may vary between
30% and 63% of the total mass depending on the 8k-mock
catalogue considered. Though tidal field and cosmic variance
effects become negligible on an 80 h^-1 Mpc scale, they
still affect the boundaries of the Lagrangian domain of a
given catalogue in a non-trivial way. As we shall show, these
problems are further enhanced by redshift distortion.

To achieve a meaningful comparison, we have run a 
reconstruction on 8k-mock6 using the Lagrangian domain 



a = 5 deg in our case.

This corresponds to taking a 5 deg wide ZOA and computing
at what distance the window is smaller than the ZOA.

Table 4. Zone of avoidance - Noise and biasing summary. Column description is given in the caption of Table 1.
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0.89 0.79 0.61 


0.37 


0.30 
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0.36 




Figure 14. Zone of Avoidance / velocity field - Left panel: Binned average RMS (Root Mean Square) error on the smoothed velocity
field. As usual, the velocity field has been smoothed with a 5 h^-1 Mpc Gaussian window. Each point is computed by averaging
the square of the deviation of the velocity field along the line of sight, over all lines of sight belonging to the same sin(b) bin,
where b is the "galactic" latitude. The solid line gives the RMS error in the presence of a zone of avoidance at b = 0. The dotted line
gives the RMS error for a reconstruction on a catalogue without ZOA. The dashed line gives the RMS of the smoothed velocity field
itself. Right panel: Scatter plots between v_r,rec and v_r,sim for individual mass tracers.



given by the simulation; this reconstruction is called TrueDom.
We now compare the results of TrueDom with two
different reconstruction setups that try to recover the
Lagrangian domain:

- NaiveDom reconstruction is obtained by assuming a
naive spherical Lagrangian domain for 8k-mock6. In that
case, all the mass that is presently in the 8k-mock6 catalogue
was initially distributed uniformly in a sphere of radius
80 h^-1 Mpc. Equivalently, it means no significant mass flow
must have gone through the comoving boundaries in the past.

- PaddedDom reconstruction is obtained by padding
homogeneously the 8k-mock6 catalogue. The padding is chosen
such that the final MAK mesh that will be reconstructed is
an inhomogeneous cube (as in the right panel of the second row
of Figs. 15 and 16). The cube must be sufficiently big to
absorb density fluctuations present at the boundary of the
catalogue (typically a 20 h^-1 Mpc buffer zone is needed). With
real data, we are bound to assume that the catalogue is
totally representative of the whole universe, i.e. its effective
mean matter density is equal to Ω_m.

Fig. 15 shows the result of a TrueDom, NaiveDom and
PaddedDom reconstruction applied to 8k-mock6 in the absence
of redshift distortion. Fig. 16 gives the same reconstructions
when applied to a redshift catalogue. Table 5
summarises the values of the moments of P(v_r,sim, v_r,rec) for
the different cases. We first compare the results of real
space reconstructions, and then those of redshift space
reconstructions.

TrueDom reconstruction does not yield any significant bias at 80 $h^{-1}$ Mpc. However, at 40 $h^{-1}$ Mpc, cosmic variance effects introduce a noticeable systematic error in the direction of higher $\Omega_m$ that will be discussed in § 5.3. Compared to TrueDom, NaiveDom gives good overall results, though the central blue region of TrueDom turns to dark blue in NaiveDom, which would suggest the velocity field is biased. This analysis is confirmed by looking at the bottom scatter plot. The $\Omega_m$ measurement (Table 5) is underestimated by about 26% even in the central region of the catalogue, which is normally less affected by boundary effects. PaddedDom, on the other hand, does not yield such a sharp discrepancy in the middle of 8k-mock6, namely in the 4k-mock6 region. Both the bottom scatter plot and the $\Omega_m$ measurement confirm that the reconstructed velocities are nearly bias-free in the central region. As expected, the velocities in the neighbourhood of the boundaries are completely wrong for the two methods.

Now, the catalogues are cut in redshift space. Redshift distortion biases the velocity distribution of objects on the catalogue boundary: the catalogue receives more infalling objects than outflowing ones. In some cases, one may even find objects artificially separated from the main volume of the catalogue (they look "disconnected"). In those cases, the hypothesis of convexity is definitely lost for those objects. This problem will enhance boundary problems. The case of TrueDom reconstruction has been discussed in § 4. As previously, the peculiar velocities in NaiveDom and in PaddedDom are largely uncorrelated in the full 8k-mock6 volume (Fig. 16). However, peculiar velocities reconstructed by NaiveDom are more strongly overestimated than those reconstructed by PaddedDom, as shown in Table 5. For NaiveDom, the scatter is plagued by a horizontal alignment in the middle-lower panels of Fig. 16, which is a signature of a strong edge effect. This spurious alignment was already present, though much less
Figure 15. Lagrangian domain / without redshift distortions - This figure summarises the results obtained on reconstructions that have limited information on the Lagrangian domain. The left column illustrates the TrueDom reconstruction, the middle column the NaiveDom one, and the right column the PaddedDom one. For reasons of space, the original velocity field given by the simulation is not repeated but can be found in Fig. [iT]. The top row illustrates the three schemes for handling boundary effects on the density field: in the left column one retains information on large scale tidal fields, in the middle column one cuts the catalogue spherically and does a reconstruction on it, in the right column one pads the spherically-cut catalogue with particles homogeneously distributed on a grid. The second row gives the reconstructed velocity field in each case, smoothed with a 5 $h^{-1}$ Mpc Gaussian window as usual. The color coding is the same as for the other figures, i.e. dark blue is -1000 km s$^{-1}$ and white is +1000 km s$^{-1}$. The third row compares the individual (not smoothed) reconstructed and simulated velocities of objects in the 8k-mock6 catalogue. The fourth row does the same comparison but only for objects lying in the 4000 km s$^{-1}$ region of the 8k-mock6 catalogue.
Figure 16. Lagrangian domain / with redshift distortions - Same as Fig. 15 but for mock catalogues including redshift distortion.



apparent, in the real space case. On the other hand, PaddedDom does not present this feature but only a large scatter. We have verified that objects belonging to the horizontal alignment are essentially near the 80 $h^{-1}$ Mpc boundary, contrary to velocities reconstructed using PaddedDom, which are more or less uniformly distributed and essentially uncorrelated with simulated velocities. This means that PaddedDom



This behaviour is expected from an algorithmic point of view. 
The objects nearby the boundary cannot acquire any displace- 
is at least better at screening edge effects than NaiveDom, in the sense that the errors are more evenly distributed and less systematic. Though impressively low in the last two rows of Table 5, the correlation coefficient $r$ is actually spoiled by the long tails of the PDF shown in the scatter plots in Fig. 16. Concerning $\Omega_m$, NaiveDom seems less robust than PaddedDom at producing an unbiased estimate. Indeed, looking at Table 5, one may note that the interval delimited by $s_{\mathrm{med}}$, $s_{\mathrm{min}}$ and $s_{\mathrm{max}}$ barely contains $\Omega_m = 0.30$ for NaiveDom/Real space/40 $h^{-1}$ Mpc, and does not contain it at all for NaiveDom/Redshift space. On the contrary, $\Omega_m = 0.30$ is always selected by the three $s$ parameters using PaddedDom reconstruction. In the rest of this paper, whenever it is needed, we will thus use the PaddedDom reconstruction.



5.3 Cosmic variance 

We generally assume that galaxy catalogues give a fair representation of the whole universe, but of course we have no guarantee that this assumption is correct. Thus, the result of a MAK reconstruction may be affected by inhomogeneities above the catalogue scale. For instance, our galaxy may reside in a particularly extreme region (overdense or underdense), which would produce unusual peculiar velocities. This effect, known as cosmic variance, can be investigated with our three original basic mock catalogues: 4k-mock6, 4k-mock7 and 4k-mock12 (§[!]). The cosmic variance effect is here further enhanced by the finiteness of the sampled volume. The volume is sufficiently small here to have a non-zero average line-of-sight velocity. On a 40 $h^{-1}$ Mpc scale, this effect can substantially modify the $\Omega_m$ measurement (denoted $\Omega_{m,\mathrm{mes}}$ in this section) by cutting the $P(v_{\mathrm{rec}}, v_{\mathrm{sim}})$ distribution at an inadequate place.

The results of the reconstruction on these three mock catalogues are given in Fig. 17. In Table 6 we give, for each mock catalogue, the best achievable result (thus highlighting purely the effect of choosing this mock catalogue) and the results one would obtain through observation of this piece of the universe. Unknown Lagrangian domain, redshift distortion and incompleteness effects are added to the considered mock catalogue. The problems of mass-to-light assignment and the zone of avoidance are left aside for the sake of clarity. Their imprint on the velocities should most likely remain the same as we have shown in the corresponding previous sections, i.e. a bias for the first and an increase of the scatter for the second. Only the cases with the aforementioned observational effects are represented in Fig. 17.

Visual inspection of the lower scatter plots in Fig. 17 shows that volume finiteness is likely making the $\Omega_{m,\mathrm{mes}}$ measurement sensitive to the "local" $\Omega_m$ ($\Omega_{\mathrm{eff}}$ in the table). This assertion is supported by the estimation of $s$ and $\Omega_m$ for



ment using MAK because of the "pressure" /competition of ob- 
jects inside the sphere. This problem is further enhanced in red- 
shift space because generally these objects come from outside the 
sphere and are selected because their infall velocity is high. In 
NaiveDom, they cannot escape from the assumed spherical La- 
grangian domain which thus leads to zeroing their velocity. On 
the other hand, PaddedDom is much less strict on the boundary, 
which leaves the freedom for MAK reconstruction to have a non- 
zero velocity even for objects on the boundary of the catalogue. 



TrueDom reconstructions given in Table 6. Moreover, experiments conducted with the spherical collapse model show that $\Omega_{m,\mathrm{mes}}$ is indeed a weighted average between $\Omega_{\mathrm{eff}}$ and $\Omega_{m,\mathrm{simu}}$.

More specifically, reconstructed velocities in 4k-mock7 (including observational effects) apparently give the $\Omega_m$ of the simulation, but they present a large scatter, rendering the slope estimation dubious. Indeed, doing the same reconstruction without observational effects gives a measured $\Omega_{m,\mathrm{mes}} = 0.40$, which is the exact average between the simulation $\Omega_{m,\mathrm{simu}} = 0.30$ and $\Omega_{\mathrm{eff}} = 0.50$. The aforementioned scatter is expected for this mock catalogue: the velocity field is badly reconstructed near the observer in that case (middle panels) because the local cosmic flow is higher than usual ($\sim 1000$ km s$^{-1}$) and the non-linearities are stronger. Thus the convexity of the problem is lost on an extended region around the observer when the reconstruction is conducted in redshift space (see § 4). A particularly salient mis-reconstruction is given by the outflowing "bubble" at the center, which disappears in the reconstructed velocity field. The size of the affected region is about 20 $h^{-1}$ Mpc around the observer in 4k-mock7 and thus limits the number of objects having both well reconstructed and observable peculiar velocities.

In an opposite way, velocities in 4k-mock12 are reconstructed with a better correlation, as shown by Table 6, but the $\Omega_m$ measurement is strongly weighted toward $\Omega_{\mathrm{eff}}$. These two "features" are largely due to the huge central void. First, MAK reconstruction and the Zel'dovich approximation are known to work better in low density regions, and being centered on a void inhibits blueshift distortion as galaxies are principally moving away from the observer, rendering the reconstruction problem of Eq. (8) convex. Second, the low density region largely affects the statistical velocity distribution, which in this case leads to a measured $\Omega_{m,\mathrm{mes}}$ weighted more strongly towards the $\Omega_{\mathrm{eff}}$ of 4k-mock12. This leads us to an $\Omega_{m,\mathrm{mes}}$ that is nearer $\Omega_{m,\mathrm{simu}}$ in 4k-mock12 than the mean matter density of the whole simulation. The volume finiteness also produces an apparent offset between reconstructed velocities and measured ones. This is expected, as doing a statistical analysis on a finite volume catalogue must introduce a selection bias effect. We have indeed checked that the point set $\{(v_{r,i}, \psi_{r,i})\}_i$ obtained through a MAK reconstruction applied on 4k-mock12 is a subset of the corresponding set built from a reconstruction on 8k-mock12. Looking at our "standard" 4k-mock6, one can note that the simulated velocity distribution is generally more symmetric about zero velocity than for the two other mock catalogues, with no visible bias when comparing reconstructed velocity to simulated velocity. This supports the initial assertion linking $\Omega_{m,\mathrm{mes}}$ to $(\Omega_{m,\mathrm{simu}}, \Omega_{\mathrm{eff}})$ and the asymmetric distribution of velocities. Potentially, one could recover the true $\Omega_m$ of the Universe (or here the simulation) from the measured velocities of any catalogue by predicting how the velocity

Spherical collapse rather predicts $\Omega_{m,\mathrm{mes}} = 0.35$ for the same setup.

The spherical collapse model would predict a measured $\Omega_{m,\mathrm{mes}} = 0.26$ and this is in good agreement with the value measured when no observational effects are injected in the mock catalogue.
[Figure 17 panel columns: 8k-Mock6, 8k-Mock7, 8k-Mock12]

Figure 17. Cosmic variance - This figure gives a visual comparison of the three mock catalogues used to study cosmic variance effects. Top panels: Adaptively smoothed density fields of the considered mock catalogues. In each case, we have represented the central thin slice that contains the observer. Second row: Simulated velocity field, after smoothing with a 5 $h^{-1}$ Mpc Gaussian window. The white circle gives the limit of the 40 $h^{-1}$ Mpc volume. Third row: Same as second row, but for the reconstructed velocity field. Fourth row: Comparison between reconstructed and simulated peculiar velocities.
Table 5. Lagrangian volume - Residual error after the correction. Description for some columns is given in the caption of Table ^. "Radius" gives the spatial size of the sphere on which the velocity-velocity comparison is conducted. "Reconstruction type" indicates the type of Lagrangian domain reconstruction and whether it is mixed with redshift distortion effect. Details on the meaning of each name are given in § 5.2.
Table 6. Cosmic variance - Summary of measurements conducted on the three mock catalogues. The reconstruction is either conducted on the basic catalogue without any observational effect besides cosmic variance (labelled Original), or on the same catalogue but affected by redshift distortion, incompleteness and for which the Lagrangian domain is determined using PaddedDom (reconstruction labelled Full). The description of the other columns is given in Table 1.
distribution asymmetry is linked to local density contrast. However, the simplest and most robust solution would still be to extend the depth of current catalogues to reach a volume where velocities are normally distributed.

From a prediction point of view, comparing visually the velocity fields inside the white circles shows that, if we know $\Omega_m$, we reconstruct plausible velocity fields for the three mock catalogues. Outside the white circles, the reconstructed velocity field is nearly completely uncorrelated with the simulated one, as we have discussed in the previous section. It must be noted that the velocity field goes smoothly to zero (green colour) on the edge of all mock catalogues: this is an expected side effect of the homogeneous padding, which tends to smooth out any fluctuation on the edge (velocity and density field).



6 VELOCITY MEASUREMENT ERRORS 

6.1 The need for a likelihood analysis?

All the effects already described in this paper are present in a redshift catalogue. Though we expect most of the observational biases to be independent, some of them may correlate and give worse systematic errors. We present in Fig. 18 the progressive deterioration of the velocity-velocity comparison for 4k-mock6 based on a reconstruction conducted on 8k-mock6. The effects are piled up from left to right. The $\Omega_m$ measurements for the $1.5\sigma$ method are indicated below each panel. The obvious conclusion is that the measurements are progressively affected but that no extra correlated error seems to arise when mixing the effects. Another fortunate outcome is that the biases seem to counterbalance one another to give in the end a nearly unbiased result (last but one panel). Going from TrueDom/Real to Redshift tends to decrease $\Omega_m$, as has been seen previously. On the contrary, injecting incompleteness pushes the measurement to higher $\Omega_m$, as we have noticed in § 3.3. The $1.5\sigma$ method seems to give the right $\Omega_m$ value in all cases, which means that we should be able to use it on galaxy catalogues provided we have sufficient precision on velocity measurements. However, looking at the last panel (bottom right) of Fig. 18 shows that injecting random velocity measurement errors (here we introduced an optimistic error of 8% of the distance to the object, corresponding to an error on distance moduli of $\sigma_\mu = 0.17$) renders slope estimation much more difficult. In that case, the measured $\Omega_m$ is severely biased. This is expected, as the $1.5\sigma$ method relies mostly on the central part of the scatter, which in turn is the one that is the most affected by random errors. This leads to a circularization of the $1.5\sigma$ isocontour and thus a completely wrong estimation of the slope. On the other hand, looking at the global structure of the scatter shows that the right slope is
still hidden in the data, but one should then take into account the tails of the distribution. This last test shows the limit of a direct velocity-velocity comparison in real cases. It might be possible to recover the original distribution of the scatter by deconvolving from the noise. However, it seems to be a difficult operation and we prefer to first try a maximum likelihood approach. Its main advantage would be to work using distances, thus rendering the error in measurements more tractable.



6.2 Maximum likelihood analysis 

Observations of galaxies first give us access to their distances and not their peculiar velocities. A method based on distances to make a comparison between a model and observations is potentially less sensitive to distance measurement errors. Indeed, by comparing distances directly, one has a small relative error on each measurement instead of a huge one when peculiar velocities are considered. Below, we discuss galaxy selection bias and zero-point calibration errors in distance measurements while keeping the notation of Strauss & Willick (1995).

Presentation of the Bayesian chain - For the Tully-Fisher (TF) relation, one makes an estimate of the absolute magnitude of a galaxy as a function of its linewidth: the slope between the two quantities can be biased because the sample is limited in magnitude (Strauss & Willick 1995). This effect, which is known as selection bias, is purely statistical and, if not correctly taken into account, can lead to large systematic errors. Using these absolute magnitudes, occasionally combined to form groups of galaxies, and the apparent magnitudes of the same group, one builds the distance modulus



$$\mu(r) = m(r) - M = 5 \log_{10}\left(\frac{r}{10\ \mathrm{pc}}\right) \tag{10}$$



with $r$ the distance of the considered object (group of galaxies or galaxy). In addition to the aforementioned statistical bias, peculiar velocities obtained from redshift positions through a Lagrangian reconstruction, here MAK, are sometimes very noisy, as shown in Fig. 18. Another more subtle effect is introduced by the Gaussian distribution of the velocity sample that we are going to analyze. We need to take care of this "selection bias" to avoid being spoiled by possible large reconstruction errors present for objects with a high velocity. Thus we need a Bayesian approach to account for all these statistical effects.
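
The link between a distance modulus error and the resulting distance and velocity errors can be made concrete with a short numerical sketch. It is only an illustration: the function names are hypothetical, first-order error propagation is assumed, and the Hubble flow is taken as $H_0 r = 100\,(r / h^{-1}\,\mathrm{Mpc})$ km s$^{-1}$.

```python
import math

def distance_modulus(r_pc):
    """Eq. (10): mu = 5 log10(r / 10 pc), with r in parsecs."""
    return 5.0 * math.log10(r_pc / 10.0)

def distance_from_modulus(mu):
    """Inverse of Eq. (10): distance in parsecs."""
    return 10.0 * 10.0 ** (mu / 5.0)

def fractional_distance_error(sigma_mu):
    """First-order error propagation: dr / r = (ln 10 / 5) * sigma_mu."""
    return math.log(10.0) / 5.0 * sigma_mu

def velocity_error(sigma_mu, r_mpc_h):
    """Implied peculiar-velocity error in km/s at distance r (h^-1 Mpc).

    Assumes H0 = 100 h km/s/Mpc, so that H0 * r = 100 * r_mpc_h km/s.
    """
    return fractional_distance_error(sigma_mu) * 100.0 * r_mpc_h

print(round(fractional_distance_error(0.17), 3))  # fractional distance error
print(round(velocity_error(0.17, 40.0)))          # km/s at 40 h^-1 Mpc
```

For $\sigma_\mu = 0.17$ the fractional distance error is close to 8%, i.e. the value quoted in § 6.1, while the implied velocity error at 40 $h^{-1}$ Mpc is a few hundred km s$^{-1}$, comparable to the peculiar velocities themselves. This is the sense in which distances carry a small relative error and velocities a huge one.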

In principle, the likelihood function gives a probability for the data, i.e. here redshift positions $\mathfrak{Z} = \{z_i\}$, with $i$ running from 1 to $N$, and distance moduli $\mathfrak{M} = \{\mu_i\}$, assuming some model described by the parameter vector $p$. Additionally, we assume that we have an estimate of the measurement errors on $\mathfrak{M}$ through the set $\mathfrak{S}$. The exact description of $\mathfrak{S}$ will be given in the next paragraph. Typically, errors on redshift measurements are of the order of 50-60 km s$^{-1}$. This means that we can consider them as negligible if we consider objects farther than $R_z = 6$-$10\ h^{-1}$ Mpc. The volume enclosed by the sphere of radius $R_z$ is, in any case, also poorly reconstructed because of the singularity introduced by redshift distortions near the observer (§ 4). In the following analysis, we will consider redshift measurement errors as negligible by avoiding the objects located at less than 10 $h^{-1}$ Mpc from the observer, thus we have:

$$P(\mathfrak{M}, \mathfrak{Z} \,|\, p, \mathfrak{S}) \propto P(\mathfrak{M} \,|\, \mathfrak{Z}, p, \mathfrak{S}) = \mathcal{L}(p) \tag{11}$$



The end of this section is devoted to computing the right 
hand part of this equation. To achieve this, we will decom- 
pose the probability into small pieces: 



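$$P(\mathfrak{M} \,|\, \mathfrak{Z}, \mathfrak{S}, p) = \iiint P(D)\, P(\mathfrak{M} \,|\, \mathfrak{M}_r, \mathfrak{S}, D, p) \times P(\mathfrak{M}_r \,|\, V, \mathfrak{Z}, p)\, P(V \,|\, \mathfrak{Z}, p)\ d\mathfrak{M}_r\, dV\, dD \tag{12}$$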



with $\mathfrak{M}_r = \{\mu_{1,r}, \ldots, \mu_{N,r}\}$ representing the "true" distance moduli, with $\mu_{i,r} \in [-\infty, +\infty]$, and $V = \{v_1, \ldots, v_N\}$ the "true" object peculiar velocities. $P(\mathfrak{M} \,|\, \mathfrak{M}_r, \mathfrak{S}, p)$ is the probability of measuring the set of distance moduli $\mathfrak{M}$ given that the real set of distance moduli is $\mathfrak{M}_r$ and the expected error on the measurement is given by $\mathfrak{S}$. $P(\mathfrak{M}_r \,|\, V, p)$ is the probability of obtaining the set of distance moduli $\mathfrak{M}_r$ given the reconstructed velocities $V$. $P(V \,|\, \mathfrak{Z}, p)$ is the probability that the velocities are well reconstructed from the redshift data $\mathfrak{Z}$. The probability $P(D)$ is going to be introduced in the last paragraph to account for uncertainty in the calibration of the Tully-Fisher relation. All those probabilities are computed assuming the model parameters $p$. We will establish the likelihood function $\mathcal{L}(p)$ in three steps:

- First, the error distributions linked to observations are considered to get an unbiased distance estimator for groups. This analysis yields the probability $P(\mu_i \,|\, \mu, \sigma_{\mu,i}, p)$.

- Second, the errors on reconstructed velocities are considered to compute $P(V \,|\, \mathfrak{Z}, p)$.

- Last, the two analyses are merged as given above to produce the likelihood function, which gives the posterior distribution of $\beta$ and the Hubble constant $H$.

A picture of the above Bayesian chain is given in Fig. 19.

Distance modulus error distribution - To establish the likelihood function comparing the measured distance to the reconstructed velocity field, we assume the distance catalogues are obtained using the inverse TF relation (Shaya et al. 1995),

$$\eta^0(M) = -e\,(M + D), \tag{13}$$

where $M$ is the absolute magnitude of the considered galaxy, $\eta^0(M)$ is its predicted linewidth, $e$ is the slope, and $D$ is the zero point calibration (the latter two are assumed to be known exactly). It is known that inverse TF is less sensitive to the selection bias as compared to forward TF (Strauss & Willick 1995). Observational data show that the differences between the predicted linewidth $\eta^0(M)$ and the measured linewidth $\eta$ for an object of absolute magnitude $M$ are Gaussian distributed (Pizagno et al. 2006; Tully & Pierce



Though it is in theory possible to avoid this hypothesis, it is in practice highly difficult for computational reasons, as one would need to run several MAK reconstructions to evaluate the extra integral that would be needed in Eq. (11).

In fact, in writing Eq. (14), two effects are mixed: the error on the measurement of linewidth, which may reach 10% because of the uncertainty in galaxy inclination correction, and the intrinsic modeling errors of the TF relation itself.
Figure 18. This figure gives the evolution of the scatter distribution and of the measurement of $\Omega_m$ using it while more and more observational effects are added to the 8k-mock6 catalogue. All measurements of $\Omega_m$ are given between brackets and are sorted as follows. For measurements obtained through the likelihood analysis, labelled by $\mathcal{L}$, the first number corresponds to $\mathcal{L}_{\mathrm{min}}$ and the second to $\mathcal{L}_{\mathrm{max}}$. For measurements obtained using the $1.5\sigma$ method, the first number corresponds to $s_{\mathrm{min}}$, then $s_{\mathrm{med}}$, and finally $s_{\mathrm{max}}$. The last (lower right) panel uses the full likelihood function of Eq. (31). All others use a restricted likelihood analysis with $\sigma_0/e = 0$, which is nearly equivalent to using Eq. (30) for each $(v_{r,i}, \psi_{r,i})$ pair. The $1.5\sigma$ isocontour has been plotted with a thick dashed line in the last panel.



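2000). Thus, the probability of measuring the linewidth $\eta$, given that the object has an absolute magnitude $M$, and assuming that the TF relation $\eta^0(M)$ is known, is

$$P(\eta \,|\, M, e, D) = \frac{1}{\sqrt{2\pi}\,\sigma_\eta(M)} \exp\left(-\frac{\left(\eta - \eta^0(M)\right)^2}{2\sigma_\eta^2(M)}\right) \tag{14}$$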



with $\sigma_\eta(M)$ the linewidth estimation error for the absolute magnitude $M$. Distance catalogues are composed of estimated distance moduli $\mu_e$ from the inverse TF relation. These estimated distance moduli are built from the statistics on a single group. Therefore, the joint probability of having a galaxy in a group with both a linewidth $\eta$ and an absolute magnitude $M$, assuming the TF relation $\eta^0(M)$, is:

$$P(\eta, M \,|\, e, D) = F(M)\, P(\eta \,|\, M, e, D) \tag{15}$$

where $F(M)$ is the normalized absolute luminosity function of the group.

The estimator for the distance modulus is given by:

$$\mu_e = m - M_e(\eta) = M + \mu_0(r) + D' + \frac{\eta}{e'}, \tag{16}$$

where $D'$ and $e'$ are the estimated inverse TF parameters of Eq. (13) and $\mu_0(r) = 5 \log_{10}\left(\frac{r}{10\ \mathrm{pc}}\right)$ the true distance modulus of the considered group. The conditional probability that the estimated distance modulus for the group is $\mu$, assuming that the estimated Tully-Fisher parameters are $e'$ and $D'$ and that the real parameters for this group are $e$ and $D$,


Note that the selection function is assumed to be independent of $\eta$ and is hence absorbed in $F(M)$. $F(M)$ corresponds to $\Phi(M) S(M, \eta)$ in the notation of Strauss & Willick (1995), e.g. their eq. (188).
Figure 19. Maximum likelihood analysis - This sketch illustrates the Bayesian chain used to establish the likelihood function. The input data are located on the left and the output posterior distribution $P(\beta, H)$ on the right.



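can be written as

$$P(\mu \,|\, \mu_0(r), e, e', D, D') = \left\langle \delta_D(\mu - \mu_e) \right\rangle_{\mathrm{group}} = \int \frac{e' F(M)}{\sqrt{2\pi}\,\sigma_\eta(M)} \exp\left(-\frac{\left(e'(\mu_0(r) - \mu) + e'D' - eD + (e' - e)M\right)^2}{2\sigma_\eta^2(M)}\right) dM . \tag{17}$$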



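While working with the inverse TF relation, one can assume that the slope $e$ is completely determined and $e' = e$. Since the observed $\sigma_\eta(M)$ varies little with $M$, it is chosen to be equal to a constant $\sigma_0$. The previous probability reduces to

$$P(\mu \,|\, \mu_0(r), e, D, D') = \frac{e}{\sqrt{2\pi}\,\sigma_0} \exp\left(-\frac{e^2\left(\mu_0(r) - \mu + D' - D\right)^2}{2\sigma_0^2}\right) \tag{18}$$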
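
The reduction from Eq. (17) to Eq. (18) can be checked numerically. The sketch below is an illustration only: it assumes the Gaussian linewidth scatter of Eq. (14) with a constant $\sigma_\eta = \sigma_0$, uses a hypothetical normalized Gaussian for the group luminosity function $F(M)$, and picks arbitrary parameter values. With $e' = e$ the integrand no longer depends on $M$ except through $F(M)$, so the integral collapses to a Gaussian in $\mu$ of width $\sigma_0 / e$.

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def p_mu_general(mu, mu0, e, e_est, D, D_est, sigma_eta, M_grid, F_grid):
    """Numerical version of Eq. (17) for a constant linewidth scatter."""
    arg = e_est * (mu0 - mu) + e_est * D_est - e * D + (e_est - e) * M_grid
    gauss = np.exp(-0.5 * (arg / sigma_eta) ** 2) / (np.sqrt(2 * np.pi) * sigma_eta)
    return integrate(e_est * F_grid * gauss, M_grid)

def p_mu_reduced(mu, mu0, e, D, D_est, sigma0):
    """Closed form of Eq. (18): a Gaussian in mu of width sigma0 / e."""
    arg = e * (mu0 - mu + D_est - D)
    return e * np.exp(-0.5 * (arg / sigma0) ** 2) / (np.sqrt(2 * np.pi) * sigma0)

# Hypothetical group luminosity function: a normalized Gaussian in M.
M_grid = np.linspace(-26.0, -14.0, 4001)
F_grid = np.exp(-0.5 * (M_grid + 20.0) ** 2) / np.sqrt(2 * np.pi)

# Arbitrary illustrative values; e' = e is the inverse-TF assumption.
general = p_mu_general(33.1, 33.0, 0.13, 0.13, 0.0, 0.05, 0.03, M_grid, F_grid)
reduced = p_mu_reduced(33.1, 33.0, 0.13, 0.0, 0.05, 0.03)
print(general, reduced)
```

The two printed values agree to within the numerical integration error, whatever normalized $F(M)$ is chosen.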

Though the slope $e$ is well determined, the zero-point calibration $D$ may still be affected by non-negligible errors. The set describing errors on distance moduli is thus $\mathfrak{S} = \{\sigma_{0,1}/e, \ldots, \sigma_{0,N}/e\} = \{\sigma_{\mu,1}, \ldots, \sigma_{\mu,N}\}$. The error on this calibration will affect the distances globally. As a first approximation we model the error on the zero point by a Gaussian centered on $D$ with a standard deviation of $\sigma_D$.

Linking distance modulus to velocity - The second probability function in Eq. (12) is $P(\mathfrak{M}_r \,|\, V, \mathfrak{Z}, p)$, which is actually a distribution linking the velocities and redshifts to the distance moduli. This essentially corresponds to a change of variable and we directly give its expression, which is inspired by Eq. ([T]):

$$P(\mathfrak{M}_r \,|\, V, \mathfrak{Z}, p) = \prod_{i=1}^{N} H\,10^{\mu_{i,r}/5}\ \delta_D\!\left(z_i - v_i - 10\ \mathrm{pc} \times H\,10^{\mu_{i,r}/5}\right) \tag{19}$$

16 The latest calibration is given in Tully et al. (2007).

Reconstructed velocity distribution - We are now going to establish the expression of $P(V \,|\, \mathfrak{Z}, p)$ with the vector of parameters of our chosen model $p = (H, \beta, B_v, \sigma_v, \gamma_*)$; $\sigma_v$ and $\gamma_*$ are going to be introduced in the following paragraphs. One may decompose $P(V \,|\, \mathfrak{Z}, p)$ as follows:



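$$P(V \,|\, \mathfrak{Z}, p) = \int P(V \,|\, \mathfrak{P}, p)\, P(\mathfrak{P} \,|\, \mathfrak{Z}, p)\ d\mathfrak{P}, \tag{20}$$

with $\mathfrak{P} = \{\psi_{r,i}\}$ the reconstructed displacements. As MAK reconstruction is deterministic once $p$ has been assumed (§ 4), the second probability distribution is simply given in our case by

$$P(\mathfrak{P} \,|\, \mathfrak{Z}, p) = \prod_i \delta_D\!\left(\psi_{r,i} - \psi_i(\mathfrak{Z}, p)\right) \tag{21}$$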



with $\psi_i$ representing the MAK reconstructed displacement of the $i$-th object, being a function of all redshift coordinates and $p$. Thus, studying $P(V \,|\, \mathfrak{Z}, p)$ reduces to examining $P(V \,|\, \mathfrak{P}(p_0), p')$, with $p' = (H, \beta', B_v, \sigma_v, \gamma_*)$, $p_0 = (H, \beta_0, B_v, \sigma_v, \gamma_*)$, $\beta_0$ being the growth factor assumed to compute the set $\mathfrak{P}(p_0)$ using the redshift reconstruction. $P(V \,|\, \mathfrak{Z}, p)$ and $P(V \,|\, \mathfrak{P}(p_0), p')$ are equal only if $p = p' = p_0$. Thus one needs several redshift reconstructions to build the probability function $P(V \,|\, \mathfrak{Z}, p)$. Working with the intermediary set $\mathfrak{P}$ is easier than with $\mathfrak{Z}$; we thus define the reduced likelihood function:

$$\mathcal{L}_{p_0}(p) = \iint P(\mathfrak{M} \,|\, \mathfrak{M}_r, \mathfrak{S})\, P(\mathfrak{M}_r \,|\, V, \mathfrak{Z}, p)\, P(V \,|\, \mathfrak{P}(p_0), p)\ d\mathfrak{M}_r\, dV \tag{22}$$

and we are going to establish the expression of the elementary probability function $P(v_r \,|\, \psi_r, p)$, which will yield

$$P(V \,|\, \mathfrak{P}(p_0), p) = \prod_{i=1}^{N} P(v_{r,i} \,|\, \psi_{r,i}, p) \tag{23}$$

assuming statistical independence of all $\{v_{r,i}, \psi_{r,i}\}$ pairs, and that $\mathfrak{P}(p_0)$ is obtained using a redshift reconstruction for which $\beta = \beta_0$. $\mathcal{L}_{p_0}$ may be written in a factorized way:

$$\mathcal{L}_{p_0}(p) = \prod_{i=1}^{N} \iint P(\mu_i \,|\, \mu_r, \sigma_{0,i}/e)\, P(\mu_r \,|\, v_r, z_i, p)\, P(v_r \,|\, \psi_{r,i}, p)\ d\mu_r\, dv_r . \tag{24}$$

The computation of $\mathcal{L}_{p_0}$ is clearly helped by this factorized form. We may now concentrate on the third probability function of the above equation.

As has been established in § 2, the distribution of errors on the reconstructed velocity field is the Lorentzian

$$P_{\mathrm{RE}}(e_{\psi v}) \propto \frac{1}{1 + \left(\frac{e_{\psi v}}{B_v}\right)^2} \tag{25}$$

where $B_v = 86$ km s$^{-1}$ (redshift reconstruction), with $e_{\psi v}$ the distance between the reconstructed velocity $\beta\psi_r$ and the true velocity $v_r$. This formulation is different from saying that the reconstructed velocity is affected by error when compared to the true velocity, and permits some errors in the MAK reconstructed displacement field. As has been seen in § 5.3, the reconstructed velocities may also contain an extra offset that needs to be removed while measuring $\beta$. The error distance $e_{\psi v}$ is thus

$$e_{\psi v} = \alpha_* v_r - \beta_* \psi_r + \gamma_* \tag{26}$$

with

$$\alpha_*^2 + \beta_*^2 = 1 \quad \text{and} \quad \beta = \frac{\beta_*}{\alpha_*} \tag{27}$$

and $\gamma_*$ to account for a potential spurious offset in reconstructed velocities. From linear theory (Peebles 1980), we know that the line-of-sight component of the velocity field must follow a Gaussian distribution. We now assume that the absolute probability for an object to have a velocity $v$ is given by a Gaussian distribution:

$$P_{\mathrm{vel}}(v \,|\, p) = \frac{1}{\sqrt{2\pi}\,\sigma_v} \exp\left(-\frac{v^2}{2\sigma_v^2}\right) \tag{28}$$



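It must be noted that it is likely that the observational data do not encompass a sufficiently large volume for the measured velocities to follow this law. Moreover, this prior is of some importance when we have to deal with highly scattered data. The shortcomings of such an approach will be discussed in the next section. One can recover the standard uniform prior on velocities by taking the limit $\sigma_v \to +\infty$ in the next equations. Assuming $e_{\psi v}$, as a random variable, is independent of $v_r$ and these two quantities are themselves statistically independent from $\beta_*$ and $\gamma_*$, we may now write the joint probability of reconstructing $\psi_r$ and having a true velocity $v_r$:

$$\begin{aligned} P(\psi_r, v_r, \beta_*, \gamma_* \,|\, B_v, \sigma_v) &= \beta_*\, P_{\mathrm{RE}}(e_{\psi v} \,|\, B_v, \sigma_v) \times P(\beta_*, \gamma_* \,|\, B_v, \sigma_v)\, P(v_r \,|\, B_v, \sigma_v)\, P(\psi_r \,|\, B_v, \sigma_v) \\ &= \beta_*\, C(B_v, \sigma_v)\, P(\beta_*, \gamma_* \,|\, B_v, \sigma_v)\, P(\psi_r \,|\, B_v, \sigma_v)\, \frac{e^{-v_r^2 / 2\sigma_v^2}}{1 + \left(\frac{\alpha_* v_r - \beta_* \psi_r + \gamma_*}{B_v}\right)^2} \end{aligned} \tag{29}$$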

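where $C$ is a function possibly depending on $B_v$ and $\sigma_v$.

The conditional probability that the true velocity is $v_r$ given the reconstructed displacement $\psi_r$ is now exactly

$$P(v_r \,|\, \psi_r, p) = \frac{\dfrac{e^{-v_r^2 / 2\sigma_v^2}}{1 + \left(\frac{\alpha_* v_r - \beta_* \psi_r + \gamma_*}{B_v}\right)^2}}{\displaystyle\int_v \frac{e^{-v^2 / 2\sigma_v^2}}{1 + \left(\frac{\alpha_* v - \beta_* \psi_r + \gamma_*}{B_v}\right)^2}\ dv} \tag{30}$$

The denominator of the right-hand side of this equation must be computed numerically. It can be shown that, in the limit $\sigma_v \to +\infty$, $P(v_r \,|\, \psi_r, \beta_*, \gamma_*)$ reverts to a pure Lorentzian form.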
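
A numerical evaluation of Eq. (30) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the paper: the function name `velocity_posterior` is hypothetical, the normalisation is done with a simple trapezoidal rule on a finite velocity grid, and $(\alpha_*, \beta_*)$ are derived from $\beta$ by assuming $\alpha_*^2 + \beta_*^2 = 1$ and $\beta = \beta_* / \alpha_*$.

```python
import numpy as np

def integrate(y, x):
    """Trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def velocity_posterior(v_grid, psi_r, beta, B_v, sigma_v, gamma_star=0.0):
    """Tabulate P(v_r | psi_r, p) of Eq. (30) on v_grid (km/s).

    Assumes alpha_*^2 + beta_*^2 = 1 and beta = beta_* / alpha_*.
    sigma_v = np.inf gives the uniform-prior (pure Lorentzian) limit.
    """
    alpha_star = 1.0 / np.sqrt(1.0 + beta ** 2)
    beta_star = beta * alpha_star
    e_psi_v = alpha_star * v_grid - beta_star * psi_r + gamma_star
    lorentzian = 1.0 / (1.0 + (e_psi_v / B_v) ** 2)
    prior = np.exp(-0.5 * (v_grid / sigma_v) ** 2)
    unnormalized = prior * lorentzian
    # The denominator of Eq. (30), evaluated numerically on the grid.
    return unnormalized / integrate(unnormalized, v_grid)

v_grid = np.linspace(-3000.0, 3000.0, 6001)
with_prior = velocity_posterior(v_grid, psi_r=400.0, beta=0.5,
                                B_v=90.0, sigma_v=326.0)
flat_prior = velocity_posterior(v_grid, psi_r=400.0, beta=0.5,
                                B_v=90.0, sigma_v=np.inf)
print(v_grid[np.argmax(with_prior)], v_grid[np.argmax(flat_prior)])
```

In the uniform-prior limit the distribution peaks at $v_r = \beta\psi_r$, whereas a finite $\sigma_v$ pulls the peak slightly towards zero velocity, which is how the Gaussian prior damps the weight of large reconstructed velocities.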

Merging the probability distributions - We may now establish the "elementary" conditional probability for an object $i$ to get a measured distance modulus $\mu_i$ given that its reconstructed displacement is $\psi_{r,i}$, its redshift is $z_i$, the error on the linewidth measurement is $\sigma_{0,i}$ and the model parameters are $p'$ in the notation of this section:

$$\begin{aligned} P(\mu_i \,|\, \psi_{r,i}(p_0), z_i, \sigma_{0,i}, D', p') &= \iint P(\mu_i \,|\, \mu_r, \sigma_{0,i}, D', p')\, P(\mu_r \,|\, v, z_i, p') \\ &\quad \times P(v \,|\, \psi_{r,i}(p_0), p')\ d\mu_r\, dv \\ &= \int P(\mu_i \,|\, \mu_r, \sigma_{0,i}, D', p')\, H\,10^{\mu_r/5} \\ &\quad \times P\!\left(v = z_i - 10\ \mathrm{pc} \times H\,10^{\mu_r/5} \,\middle|\, \psi_{r,i}(p_0), p'\right) d\mu_r, \end{aligned} \tag{31}$$

with $p' = (H, \beta', B_v, \sigma_v, \gamma_*)$, $\psi_{r,i}(p_0)$ being computed assuming the parameters $p_0$. Looking closely at this probability, one may notice that changing $D \to D' = D + \Delta$ is equivalent to changing $H \to H' = H \exp(\Delta/5)$. Thus the uncertainty in the zero point calibration translates only into an uncertainty on $H$ and not on the other parameters of the model.

We may now write the full formal expression of $\mathcal{L}_{p_0}(p)$, as already sketched in Eq. (22). As specified in the discussion, we take $P(D')$ to be a Gaussian distribution centered on $D$ and with a standard deviation $\sigma_D$. Now we may replace and get:

$$\mathcal{L}_{p_0}(p') \propto \int_{D'} e^{-\frac{(D - D')^2}{2\sigma_D^2}} \prod_i P(\mu_i \,|\, \psi_{r,i}(p_0), z_i, \sigma_{0,i}, e, p')\ dD', \tag{32}$$

with $i$ running over the objects of the catalogue. Assuming a uniform prior on $\beta$, $H$ and $\gamma_*$ and taking care of the relation between $\mathcal{L}_{p_0}$ and $\mathcal{L}$ as mentioned above, Bayes' theorem permits us to write

$$P(H, \beta, \gamma_* \,|\, \mathfrak{S}, \mathfrak{Z}, e, B_v, \sigma_v) \propto \mathcal{L}(p) = \mathcal{L}_{p}(p) \tag{33}$$



We now have access to the posterior distribution of 



Footnote: the function appearing in the denominator of Eq. (30) is known as a Voigt profile.
6.3 Results

The results of measuring $\Omega_m$ using the maximum likelihood
estimator are presented in the tables using the label $\mathcal{L}$.

Except in the case where we consider observational errors,
we use a simplified version of $\mathcal{L}$ by taking $\sigma_{0,i} = 0$.
While it would have been natural to find the maximum of
the likelihood for all parameters (including $B_v$, $\sigma_v$, $\gamma^*$), we
quickly noticed that this led to unacceptably biased
measurements and to an unnecessary increase of the parameter
space. Moreover, the results depend quite strongly
on $\sigma_v$ and $\gamma^*$, especially when the reconstruction noise
becomes high, as in redshift reconstructions (see Appendix D).
We thus propose to discuss the values obtained by setting
$\gamma^* = 0$, $B_v = 90$ km s$^{-1}$ and choosing two values for $\sigma_v$.
First, we use linear theory to predict the average velocity
dispersion of haloes in the universe, which leads us to take
$\sigma_v = 326$ km s$^{-1}$ (the $\Omega_m$ measured that way is labelled
$\mathcal{L}_{\max}$). Second, $\sigma_v = +\infty$ is used to check the influence of
recovering a uniform prior on the velocity distribution
(labelled $\mathcal{L}_{\min}$).

By looking at all tables of this paper, we noticed that
the difference between the two measured $\Omega_m$ mostly follows
the interval defined by $s_{\min}$ and $s_{\max}$. We were
expecting such a behaviour ($\sigma_v$ more or less controls the
statistical bias of the likelihood function) but not that it
would so clearly follow the other method. As expected, the larger
the scatter, the less precise the measurement becomes.
It must however be noted that, on
average, the measurement $\mathcal{L}_{\max}$ suffers less systematic bias than
$\mathcal{L}_{\min}$. This behaviour is supported by the tests conducted in
Appendix D.

The seemingly well estimated $\Omega_m$ in the lower right
panel of Fig. 18 has been computed using the full likelihood
analysis. Actually, compared to the 1.5σ method, for which
the measured slope is undefined, $\mathcal{L}_{\min}$ and $\mathcal{L}_{\max}$ are basically
the same as when no observational errors are introduced.

The correction based on a Gaussian velocity distribution
assumption cannot be entirely trusted for 4k-mock7
and 4k-mock12. As one may note in Fig. 17, the velocity
distribution is highly non-Gaussian in these cases. This renders
our distribution modeling in § 6.2 incorrect. Looking at
Table 6, we note that though the measurements on the "Original"
reconstruction are not strongly affected, the same cannot be said
of the data obtained from the "Full" reconstruction.
In the first case, the noise is sufficiently low that
the prior does not have much importance, whereas in the
second case the wrong modeling of the velocity distribution
leads to a strong error on the measured $\Omega_m$. Fortunately,
the scatter distribution presents different types of properties
that lead to compatible measurements in Table 5 between
the maximum likelihood ($\sigma_v = +\infty$ to remove the Gaussian
prior) and the 1.5σ method. For 4k-mock7 and 4k-mock12,
the slope estimate is helped by probing velocities with high
magnitudes, leaving less room for systematic error on
the slope.

One is thus led to use a sufficiently deep distance catalogue
to ensure the velocity distribution is more or less Gaussian,
so as to be able to apply the correction to the likelihood analysis.
In this case, one may rely on the value given by $\mathcal{L}_{\max}$.
If, on the contrary, the velocity distribution is highly
non-Gaussian, one must use $\mathcal{L}_{\min}$. If possible, a visual inspection
of the velocity-velocity scatter plot must be conducted to
give a check on the amount of statistical biasing.



CONCLUSION 

The Monge-Ampère-Kantorovitch method has been applied
with success to reconstruct the velocity field and the density
field of simulations (Mohayaee et al. 2006), providing
an interesting tool to apply to galaxy catalogues in order
to recover the dynamics of our local universe. This method
presents the interesting advantage of finding the exact solution
of an approximated dynamical problem written in Lagrangian
coordinates. The Lagrangian description presents
two major advantages. First, it gives a real estimation of
peculiar velocities for each galaxy or group of galaxies, as
opposed to a field description which would give an average
value at a given spatial position (which is also possible to
build using the Lagrangian description). Second, it permits
us to use the Zel'dovich approximation, which gives better
peculiar velocity predictions than linear Eulerian theory applied
to the same dark matter density field. It means that we
expect this method to give better and more spatially
resolved results than, e.g., the POTENT method (Bertschinger &
Dekel 1989) or velocity field reconstruction through spherical
harmonics (Regős & Szalay 1989). Now, most previous
analyses of Lagrangian peculiar velocity reconstruction have
been run on particle catalogues coming from simulations.
However, galaxy catalogues are not as simple, and
the main problems are as follows:

(i) Catalogues mostly provide redshift positions of galaxies 
and for a few objects their physical distances from us. 

(ii) The luminosity is the only known "dynamical" quantity
for most objects in catalogues and so we need to extrapolate
the $M/L$ relation for known objects to the ones that we do
not know.

(iii) Incompleteness effects have to be taken into account: 
either because of magnitude limitation or due to extinction 
of objects by the galactic plane. 

(iv) The MAK reconstruction also needs the Lagrangian do- 
main of the galaxy catalogue. 

All these biases and unknown quantities render the recon- 
struction problem much more difficult than in simulations. 
We propose here both to test the feasibility of such a recon- 
struction on galaxy catalogues and the methods to overcome 
the problems that we have just cited. We have tried to ad- 
dress the following problems: 

- Reducing the systematic errors introduced by the unknown
bias between mass and luminosity tracers. The dark
mass can be either put uniformly into the catalogue or put
in the detected haloes (§ 3.1). It appears that there exists
an optimum way to distribute the mass, as can be seen in
Fig. 5, which gives unbiased and noiseless reconstructed velocities,
even though the exact location of 63% of the mass
in the universe remains unknown. In addition to this
global problem, the relative mass distribution between
objects in the catalogue is also uncertain as we do not know
their true $M/L$. The induced systematic errors have been
studied in § 3.2 and we show that the naive approach corresponding
to using $M/L$ = constant inevitably gives a large
bias on reconstructed velocities. Even a reasonable guess, for
instance the one proposed by Marinoni & Hudson (2002), is
still significantly biased. This suggests that more work must
be done on the $M/L$ relation, particularly at the high mass
end. However, on the positive side, large random errors on
$M/L$ do not yield any systematic effect and only increase
the scatter in the velocity-velocity comparison.

- We proposed a slightly improved way to correct for
incompleteness effects in galaxy catalogues and their effect on
reconstruction. Though it has given good results, we do not
expect this method to be completely bias-free as it presents
the same deficiencies as the previous item. However, by
enforcing the correction on the mass distribution, we managed
to preserve the dynamics in the observational data
better than would be the case if we had enforced it on the
luminosity distribution.

- We investigated the possible systematic errors in redshift
reconstructions as proposed previously by Mohayaee &
Tully (2005), which correspond to the inverse redshift
operator studied by Valentine et al. (2000). It appears that,
though the bias is small, $\Omega_m$ tends to be always underestimated.

- Two solutions to overcome the Lagrangian volume uncertainty
for the case of finite volume catalogues have been
investigated. The reconstruction method which gives the better
results seems to be PaddedDom. The other alternative, NaiveDom,
appears to bias the reconstructed velocities, especially
in the case of a redshift reconstruction.

- The efficiency of the correction for the zone of avoidance
as proposed by Shaya et al. (1995) has been checked (§ 5.1).
It appears that the correction is bias free and only introduces
a small, but noticeable, additional noise for objects in the
direction of the zone of avoidance.

- We checked that the resulting errors of each effect are
uncorrelated, so they only pile up without producing a strong
additional bias. It is fortunate that some observational effects
produce complementary biases: incompleteness
tends to lead to an overestimate of $\Omega_m$ whereas redshift distortion
leads to an underestimate. The resulting bias is thus not so important.

- We finally tried two estimators to measure $\Omega_m$ from
both reconstructed displacement and distance measurement
(§ 6): the 1.5σ and the maximum likelihood estimators. However,
the first one is not able to work with noisy measured
velocities, and the second one is badly affected by large
distribution tails in redshift reconstruction. Adding a prior on
the distribution of velocities in the catalogue helps to reduce
the bias, at the cost of requiring a good measurement of the
width of this distribution. A good estimate of $\Omega_m$ is thus
rendered more problematic, though we have shown that it
should be feasible in principle.

We intend to continue this work in the following directions:

- This method can be applied to make a measurement
of $\Omega_m$ in NBG-8k/NBG-3k catalogues and in the upcoming
6dFGS redshift and distance catalogues.

- A better comparison to the acoustic peaks of the CMB
can potentially be obtained using the reconstructed displacement
field (Eisenstein et al. 2006).

- We can apply MAK reconstruction on SDSS and
2MASS catalogues to obtain the initial Lagrangian positions
and velocities of objects in our local universe. This
would make it possible, for the first time, to re-simulate our local
universe and to check the MAK prediction
and correction schemes on real observations.

- We also want to improve the reconstruction itself and
propose a new algorithm to include further gravitational effects
during orbit reconstructions. This will never give us the
internal structure of objects but will potentially give better
reconstructed velocities while keeping the power of the
MAK reconstruction.
$\tilde{n}_i$ such that the quantity

$$\chi^2 = \sum_i \left( \frac{\tilde{n}_i m_R - M_i}{M_i} \right)^2$$

is minimized given the constraint $\sum_i \tilde{n}_i = N_{\rm MAK}$,
where $N_{\rm MAK}$ is the total number of nodes on the MAK grid, such
that $N_{\rm MAK} \times m_R$ is as close as possible to the total mass,
$\sum_i M_i$. The minimization of $\chi^2$ is performed iteratively until
convergence. Note that the solution of such a minimization
is, in general, not unique due to the possible permutations
between objects of the same mass. Because of this degeneracy,
the tracers must be shuffled randomly prior to the
minimization in order to avoid possible systematic effects.

Note finally that one must make sure that there are at
least a few particles per tracer, $n_i \geq a$ with $a > 1$. This
brings constraints on $m_R$ and therefore on the size of the
MAK mesh. Unfortunately, it is not always possible to have
$a > 1$ due to the prohibitive CPU cost it would imply for the
MAK reconstruction in the present paper. To address this
problem, we separate the catalogue into groups of galaxies
and field galaxies. For the groups, the minimization is
performed as explained above, with a possible loss of the
lightest ones since $n_i$ can still be smaller than unity. For
the field galaxies, we use a simpler procedure as follows.
Given the mass $M_i$ of a galaxy $i$, a MAK tracer is randomly
assigned to it with occurrence probability $M_i/m_R$.
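The assignment of an integer number of mesh particles to each tracer can be sketched as follows. This is only an illustration under stated assumptions: it uses a simple greedy adjustment (start from the nearest integers, then add or remove one particle at a time where the relative mass error grows the least) rather than the iterative minimization used in this paper, and the function names `assign_particles` and `assign_field_tracers` are hypothetical.

```python
import numpy as np

def assign_particles(masses, m_r, n_total, rng):
    """Integer particle counts per tracer with sum == n_total (greedy sketch)."""
    masses = np.asarray(masses, dtype=float)
    order = rng.permutation(masses.size)       # shuffle to break ties between equal masses
    m = masses[order]
    n = np.rint(m / m_r).astype(int)           # start from the nearest integers

    def cost(k):
        # Per-tracer squared relative mass error for the counts k.
        return ((k * m_r - m) / m) ** 2

    while n.sum() != n_total:
        step = 1 if n.sum() < n_total else -1
        delta = cost(n + step) - cost(n)       # change in the total error for each tracer
        if step < 0:
            delta[n == 0] = np.inf             # counts cannot become negative
        n[np.argmin(delta)] += step
    out = np.empty_like(n)
    out[order] = n                             # undo the shuffle
    return out

def assign_field_tracers(masses, m_r, rng):
    """Field galaxies: keep one mesh particle with probability M_i / m_R."""
    masses = np.asarray(masses, dtype=float)
    return rng.random(masses.size) < np.minimum(masses / m_r, 1.0)

rng = np.random.default_rng(0)
counts = assign_particles([10.3, 4.7, 1.2, 7.9], m_r=1.0, n_total=25, rng=rng)
```

In this toy example the extra particle goes to the most massive tracer, where it costs the least in relative mass error.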



APPENDIX B: TOOLS FOR ERROR ANALYSIS 

To check the accuracy of the reconstructions, we compute
the moments of the joint probability distribution of the
reconstructed velocities $v_{{\rm rec},i}$ of object $i$ and the simulated
velocities of those objects, $v_{{\rm sim},i}$. We write $\langle A \rangle$ the average
of the quantity $A$:

(B1)



APPENDIX A: CONSTRUCTION OF A MAK 
MESH 

MAK reconstructions require a sampling of the matter distribution
with "particles" of equal mass corresponding to
nodes of a homogeneous mesh. When considering the simulation,
one uses a full periodic cubic mesh. However, in
real galaxy catalogues, the relevant Lagrangian volume is a
non-periodic compact subset inscribed in a larger rectangular
mesh. In that case, the assignment is performed only
for "particles" belonging to this initial volume. Note that
the determination of this initial volume is by itself a great
challenge and a poor guess can have dramatic consequences.

Given a number of "galaxies", or tracers, for which the
individual masses $M_i$ are known and a choice of the mass
resolution of the MAK grid, $m_R$, the problem is now to determine
how many "particles" have to be assigned to tracer
$i$. This number should be $n_i = M_i/m_R$, which is rarely an integer.
To address this issue, we construct an integer function



We define three second moments (after subtraction of
the average): $\langle v_{\rm rec}^2 \rangle$, $\langle v_{\rm sim}^2 \rangle$ and $\langle v_{\rm rec} v_{\rm sim} \rangle$.
From these moments we can build the correlation coefficient:

$$r = \frac{\langle v_{\rm rec} v_{\rm sim} \rangle}{\sqrt{\langle v_{\rm rec}^2 \rangle \langle v_{\rm sim}^2 \rangle}},$$

and the ratio between the width of the reconstructed field
PDF (density or velocity) and the width of the simulated (mock)
field PDF:

$$s = \sqrt{\frac{\langle v_{\rm rec}^2 \rangle}{\langle v_{\rm sim}^2 \rangle}}. \qquad (B4)$$



For these two quantities the optimum value is 1. Alternatively,
two other "slope" estimators of the reconstructed velocities
versus the simulated ones can be built from the above
moments:

$$s_{\min} = s\,r \quad {\rm and} \quad s_{\max} = s/r.$$
These two slopes are interesting when one makes an estimation
of $\Omega_m$ through $s$ and needs an evaluation of the uncertainty.
The two extra slopes determined in this way
should, ideally, be equal to $s$ but, due to the lack of perfect
correlation ($r < 1$), they are actually different from it in
realistic cases. In fact, we have $s_{\min} < s_{\rm med} < s_{\max}$.
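A minimal sketch of these estimators, assuming that $r$ is the usual correlation coefficient, that $s$ is the ratio of the root-mean-square widths of the reconstructed and simulated velocities, and that the two extra slopes are $s\,r$ and $s/r$. The function name and the synthetic data are hypothetical and only serve to illustrate the ordering of the slopes.

```python
import numpy as np

def velocity_comparison_stats(v_rec, v_sim):
    """Correlation coefficient, width ratio and bracketing slopes of v_rec versus v_sim."""
    v_rec = np.asarray(v_rec, dtype=float)
    v_sim = np.asarray(v_sim, dtype=float)
    v_rec = v_rec - v_rec.mean()               # second moments are taken after
    v_sim = v_sim - v_sim.mean()               # subtraction of the average
    rec2 = np.mean(v_rec ** 2)
    sim2 = np.mean(v_sim ** 2)
    cross = np.mean(v_rec * v_sim)
    r = cross / np.sqrt(rec2 * sim2)           # correlation coefficient
    s = np.sqrt(rec2 / sim2)                   # ratio of the PDF widths
    return {"r": r, "s": s, "s_min": s * r, "s_max": s / r}

# Synthetic example: simulated velocities plus uncorrelated reconstruction noise.
rng = np.random.default_rng(1)
v_sim = rng.normal(0.0, 300.0, 20000)
v_rec = v_sim + rng.normal(0.0, 150.0, 20000)
stats = velocity_comparison_stats(v_rec, v_sim)
```

For positively correlated fields with $r < 1$, the three slopes are ordered as $s\,r < s < s/r$, which gives a simple bracket on the slope when the scatter is significant.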

Note that we can also define the relative dispersion $\sigma$,
which is a measure of the noise-to-signal ratio: high $\sigma$ corresponds
to low signal. Ideally, one wants $\sigma = 0$.



APPENDIX C: SIMULATING 
MAGNITUDE-LIMITED CATALOGUES 

Having only a halo catalogue, we must generate a "galaxy
catalogue" including incompleteness effects. The main difficulty
in that construction is that the distribution of galaxies
in the universe is a non-trivial, non-linear functional of
the total matter density field. For instance, bright galaxies
tend to concentrate in massive structures (Zandivarez
et al. 2006). It means that, though most of the field galaxies
are missed, the major groups can still be easily seen due
to the bright galaxies they contain. Thus the galaxy distribution
should mostly trace large haloes at large distances,
potentially introducing a bias in the reconstructed peculiar
velocities if incompleteness corrections are performed unwisely.
In what follows, we generate mock galaxy catalogues
like NBG-8k/3k. To take properly into account the effects
discussed above, we separate groups of galaxies from field
galaxies. Groups are populated with galaxies following the
universal Schechter form for simplicity, but with a different
normalization to account for their non-trivial $M/L$.

Statistically, NBG-8k/3k catalogues are composed of 
galaxies measured in the B band and distributed according 
to the Schechter form 



(CI) 



with L* 5.7 X 10^° L© and no ^ 0.03 /i^Mpc"^. Moreover, 
the NBG-8k catalogue is complete above 3 x 10^ —4 x 10^ L© 
inside a sphere of radius $d_{\rm comp} = 12\,h^{-1}$ Mpc. As the
mean "galaxy" (particle) density in the simulation is $n_{\rm sim} = 0.26\,h^3$ Mpc$^{-3}$
and about $n_{\rm cat} = 0.08\,h^3$ Mpc$^{-3} \simeq 0.30\,n_{\rm sim}$
in NBG-8k, we must dilute the simulation to get a mock catalogue
similar to NBG-8k. The luminosity $L_G$ of a detected
galaxy at a distance $d$ from the observer must satisfy the
constraint



$$L_G > 4\pi l_{\rm cut} d^2, \qquad (C2)$$

with $l_{\rm cut}$ the minimum flux detectable by the observer. The
fraction of galaxies detected at the distance $d$ in the galaxy
mock catalogue is thus



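$$f_{\rm field}(d) = \begin{cases} 0.30 & {\rm if}\ d < d_{\rm comp}, \\ \dfrac{\int_{4\pi l_{\rm cut} d^2}^{+\infty} n(L)\,dL}{\int_{L_{\min}}^{+\infty} n(L)\,dL} & {\rm otherwise.} \end{cases} \qquad (C3)$$

The fraction is saturated at 0.30 to follow the dilution constraint
expressed above. We enforce the continuity of $f_{\rm field}(d)$ by
choosing $L_{\min}$ such that $f_{\rm field}(d_{\rm comp}) = 0.30$.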
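The saturation and continuity conditions can be illustrated numerically. The sketch below rests on several assumptions that are not taken from this paper: the detected fraction beyond $d_{\rm comp}$ is taken to be the ratio of the Schechter number densities above the flux limit and above $L_{\min}$, the faint-end slope `alpha = -1.0` is a placeholder, luminosities are expressed in units of $L_*$, and all function names are hypothetical.

```python
import numpy as np

def schechter_count(l_low, alpha=-1.0, l_star=1.0, n0=1.0, n_grid=4000):
    """Number density of galaxies brighter than l_low for a Schechter function."""
    x_low = l_low / l_star
    x = np.geomspace(x_low, max(x_low, 1.0) * 60.0, n_grid)   # exp(-x) is negligible beyond
    y = n0 * x ** alpha * np.exp(-x)
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))        # trapezoidal rule

def solve_l_min(l_complete, f_sat=0.30, **kw):
    """Bisection in log L for L_min such that N(>l_complete) / N(>L_min) = f_sat."""
    target = schechter_count(l_complete, **kw) / f_sat
    lo, hi = l_complete * 1e-12, l_complete
    if schechter_count(lo, **kw) <= target:
        raise ValueError("no L_min found in the search bracket")
    for _ in range(100):
        mid = np.sqrt(lo * hi)
        if schechter_count(mid, **kw) > target:
            lo = mid            # too many galaxies: raise the threshold
        else:
            hi = mid
    return np.sqrt(lo * hi)

def detected_fraction(d, d_comp, l_cut, l_min, f_sat=0.30, **kw):
    """Fraction of field galaxies kept at distance d, saturated inside d_comp."""
    if d < d_comp:
        return f_sat
    return schechter_count(4.0 * np.pi * l_cut * d ** 2, **kw) / schechter_count(l_min, **kw)

d_comp = 12.0                                    # h^-1 Mpc
l_cut = 0.01 / (4.0 * np.pi * d_comp ** 2)       # illustrative flux limit (L_* per Mpc^2)
l_min = solve_l_min(4.0 * np.pi * l_cut * d_comp ** 2)
fractions = [detected_fraction(d, d_comp, l_cut, l_min) for d in (6.0, 12.0, 24.0, 48.0)]
```

By construction the fraction equals 0.30 up to $d_{\rm comp}$ and then decreases with distance as the flux limit cuts deeper into the luminosity function.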



The mock galaxy and group of galaxies catalogue is now 
built: 

I. We take a halo A from FullMock and assume it is a 
group of galaxies. We thus deduce the intrinsic luminosity 
La from the mass Ma of this object using Eq. ([s]). 
II. The observed luminosity $L'_A$ of A is computed assuming
that its galaxy population follows (C1) but with a different
normalization to achieve the intrinsic luminosity $L_A$. If $d_A$
is the distance between the observer and the halo A, then
the galaxies detected in this halo verify (C2) for $d = d_A$.
The total observable luminosity for A is thus



l'a = LAfhidA) 

with, assuming Lmin ^ L^ 
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d < c/c, 
d > dn, 
III. If $L'_A < 4\pi d_A^2 l_{\rm cut}$ then A is removed from the catalogue,
otherwise it is kept.

IV. This gives us the group component of our magnitude- 
limited catalogue. 

V. The case of the "field galaxies" is treated separately.
Galaxies are identified with dark matter particles and their
luminosity is assigned following (C1). More specifically, we
choose a shell $S_d$ put at a distance $d$ from the observer.
The probability of keeping a "galaxy" G in $S_d$ is given by
(C3). Inside the shell $S_d$, the selected "galaxies" now share the



luminosity 



Li{d) 
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Ln{L) dL 
which is distributed evenly among them. Strictly speaking,
such a distribution should be performed randomly according
to (C1). That would add a small additional noise on the
reconstructed velocities. This noise should be of insignificant
consequence, as supported by the discussion of the TS-T case
in § 3.2.



We now have a realistic mock catalogue and we try to
account for its incompleteness as we would for NBG-8k:

A. The missing luminosity in groups is corrected. In order
to do this, we compute, in a thin shell $S_d$ at some distance
$d$, the ratio $b(d)$ between the expected total luminosity and the
observed luminosity
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/,°° LnjL) dL 
The intrinsic luminosity $L_A$ of a group A in $S_d$ is recovered
with

$$L_A = L_{{\rm obs},A}\, b(d). \qquad (C8)$$

The mass $M_A$ of A can then be obtained using the non-linear
relation

B. The remaining missing mass in $S_d$ can be written

$$M_{{\rm missed},d} = \Upsilon\, b(d) \left( L_{{\rm field,obs},d} + L_{{\rm group,obs},d} \right) - M_{{\rm field,obs},d} - M_{{\rm group,obs},d}, \qquad (C9)$$
Figure C1. Magnitude limitation / Filling missing mass - This
plot gives the measured amount of mass in a thin shell at different
distances from the observer. The solid line gives the original mass
distribution in the simulation, the dot-dashed line the mass
distribution after mimicking incompleteness and the dashed line the
recovered mass distribution after correction for incompleteness as
described in Appendix C.



with $\Upsilon = 93$ the average $M/L$, $L_{{\rm group,obs},d}$ the observed
luminosity of groups, $M_{{\rm group,obs},d}$ the masses of
groups obtained after the above correction, and $L_{{\rm field,obs},d}$ the
luminosity of field galaxies. The quantity $M_{{\rm missed},d}$ comes
from both missing galaxies and missing groups of galaxies.
If $M_{{\rm missed},d} > 0$ and without any further information, the
missing mass may either be assigned evenly to field galaxies
of $S_d$ (our choice, as usually performed in the literature),
or distributed uniformly in $S_d$ using new random tracers. If
$M_{{\rm missed},d} \leq 0$, the mass distribution in $S_d$ is untouched.

This procedure is certainly not free from biases. For instance,
the contrasts between shells are partly smoothed out,
as illustrated by Fig. C1. This is equivalent to reducing the
overall magnitude of fluctuations in the density field. As a
result, a small bias towards larger $\Omega_m$ might occur, as in
the lower right panel of Fig. 5 of § 3.1. Conversely, if
the missing mass is assigned to detected background galaxies,
the estimated $\Omega_m$ is expected to underestimate the true
value, as discussed in § 3.1.

C. Note that the mass of the "field galaxies" is not the
mass of a single particle anymore. The procedure explained in
Appendix A is simplified as follows, for the sake of algorithmic
simplicity. With $v \in (0, 1]$ a uniform random variable, a
galaxy G of mass $m_G$ is split into $n_G$ subcomponents of
mass $m_{\rm particle}$ such that:

$$n_G = \begin{cases} \lfloor m_G/m_{\rm particle} \rfloor + 1 & {\rm if}\ v \leq r_G, \\ \lfloor m_G/m_{\rm particle} \rfloor & {\rm otherwise,} \end{cases}$$

with $r_G = m_G/m_{\rm particle} - \lfloor m_G/m_{\rm particle} \rfloor$, $\lfloor x \rfloor$ being the integer part of $x$. Each
of the subcomponents is now considered as a "field galaxy"
in the procedure explained in Appendix A.
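The random splitting of a field galaxy into equal-mass subcomponents amounts to a stochastic rounding of $m_G/m_{\rm particle}$. The sketch below assumes that the extra subcomponent is added when the uniform variable $v$ does not exceed the fractional part $r_G$, so that the mass is conserved on average; the function name is hypothetical.

```python
import numpy as np

def split_into_particles(m_g, m_particle, rng):
    """Number of subcomponents of mass m_particle for a galaxy of mass m_g."""
    ratio = m_g / m_particle
    base = int(np.floor(ratio))           # integer part of m_g / m_particle
    r_g = ratio - base                    # fractional part
    v = 1.0 - rng.random()                # uniform random variable in (0, 1]
    return base + 1 if v <= r_g else base

rng = np.random.default_rng(2)
draws = np.array([split_into_particles(2.3, 1.0, rng) for _ in range(20000)])
mean_count = draws.mean()                 # close to 2.3 for a large number of draws
```

Each galaxy receives either the integer part of the ratio or one more, and the average number of subcomponents tends to $m_G/m_{\rm particle}$.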



Note that a prior assumption on the value of $\Omega_m$ is obviously
needed to estimate $\Upsilon$.
Figure D1. Statistical bias - Scatter plot of 10,000 randomly
generated points following the approximated probability laws
found between reconstructed velocities and simulated velocities.



APPENDIX D: STATISTICAL BIAS IN THE 
SLOPE ESTIMATION 

The two methods that we used for slope estimation are
known to be biased. A more precise treatment of this bias
is beyond the scope of this paper. However, we propose here
to check the order of magnitude of the systematic effect of
the statistical analysis itself. To achieve this we produced a
set of randomly generated "velocities" $v$ and their "reconstructed
velocities" counterparts $v_R$. The probability for a
point $(v, v_R)$ to have a velocity $v$ is given by

$$P_v(v) = \frac{1}{\sqrt{2\pi}\,\sigma_v}\, e^{-v^2/(2\sigma_v^2)}, \qquad (D1)$$

with $\sigma_v = 300$ km s$^{-1}$ typically. The probability for it to have
a reconstructed velocity $v_R$ is given by the same probability
law. We now compute the error $e$ between $v_R$ and $v$, which
must be distributed according to the Lorentzian form

$$P_{DE}(e) = \frac{1}{\pi B}\, \frac{1}{1 + e^2/B^2}, \qquad (D2)$$

with $B = 86$ km s$^{-1}$. The error $e$ is related to $v$ and $v_R$ by

$$e = \alpha^* v - \beta^* v_R. \qquad (D3)$$

For the rest of the appendix we take $\alpha^* = \beta^* = 1/\sqrt{2}$. The
probability of keeping a point $(v, v_R)$ with an error $e$ is given
by, integrating $P_{DE}(e')$ between $-e$ and $+e$,

$$\int_{-e}^{+e} P_{DE}(e')\, de' = \frac{2}{\pi} \tan^{-1}\left( \frac{e}{B} \right). \qquad (D4)$$
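As a quick numerical check of this integral, the sketch below compares a trapezoidal integration of a normalized Lorentzian of width $B$ with the arctangent expression. The normalization $1/(\pi B)$ of the Lorentzian and the function names are assumptions made for illustration.

```python
import numpy as np

def lorentzian(e, b):
    """Normalised Lorentzian of width b."""
    return 1.0 / (np.pi * b * (1.0 + (e / b) ** 2))

def prob_within(e_max, b):
    """Integral of the Lorentzian between -e_max and +e_max."""
    return 2.0 / np.pi * np.arctan(e_max / b)

b = 86.0                                            # km/s
e_grid = np.linspace(-200.0, 200.0, 200001)
y = lorentzian(e_grid, b)
numeric = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(e_grid))   # trapezoidal rule
analytic = prob_within(200.0, b)
```

Half of the probability is enclosed within $|e| < B$, which shows how slowly the Lorentzian tails converge compared with a Gaussian of similar width.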



We represent in Fig. D1 a scatter plot of 10,000 points
generated using this procedure. As one can see, it does look
like a real scatter plot of a redshift reconstruction.

Conducting a 1.5σ analysis on this set of points, we
find a slope of $1.0 \pm 0.20$. In our case, this would
give $\Omega_m = 0.30 \pm 0.10$. Estimating the slope using the maximum
likelihood approach gives, with $\sigma_v = +\infty$, $\beta^*/\alpha^* = 0.81 \pm 0.01$
($\Omega_m = 0.20 \pm 0.02$) and, with $\sigma_v = 300$ km s$^{-1}$,
$\beta^*/\alpha^* = 1.074 \pm 0.012$ ($\Omega_m = 0.34 \pm 0.02$). Using the value of $B$
appropriate for real space reconstructions, both for the generated data
and the likelihood function, reduces the error
and gives $\Omega_m = 0.31 \pm 0.02$, thus highlighting the importance
of the reconstruction noise for a good estimation of $\Omega_m$.
Consequently, though one must rely on the likelihood
analysis, it may be strongly biased by the structure of
reconstruction errors mixed with the non-uniform distribution
of observables. We tried to make a good approximate model
of the errors, though it seems to depend quite strongly on the value
of $\sigma_v$. Whenever possible, of course, one must crosscheck the
result of the likelihood by a visual inspection of the scatter
plot.



